This paper concerns two important issues in the design of optimal languages
for direct execution in an interpretive system: binding the operand identifiers
in an executable instruction unit to the arguments of the routine implementing
the operator defined by that instruction; and binding operand identifiers to
execution variables. These issues are central to the performance of a system,
both in space and time.

Historically, some form of "machine language" has been used as the directly
executable medium for a computing system. These languages are traditionally
constrained to a single "n-address" instruction format; this leads to an
excessive number of "overhead" instructions, which do nothing but move values
from one storage resource to another, being embedded in the executable
instruction stream. We propose to reduce this overhead by increasing the number
of instruction formats available at the directly executed language level.

Machine languages are also constricted with respect to the manner in which
operands can be "addressed" within an instruction. Usually, some form of indexed
base-register scheme is available, along with a direct addressing mechanism for
a few "special" storage cells (i.e., registers, and perhaps the zeroth page of
main store). We propose a different identification mechanism, based on the
Contour Model of Johnston. Using our scheme, only N bits are needed to encode
any identifier in a scope containing fewer than 2**N distinct identifiers.

Together, these two results lead to directly executed language designs which
are optimal in the sense that: (1) k executable instructions are required to
implement a source statement containing k functional operators; (2) the space
required to represent the executable form of a source statement containing k
distinct functional operators and v distinct variables approaches F*k + N*v,
where there are fewer than 2**F distinct functional operators in the scope of
definition for the source statement, and fewer than 2**N distinct variables in
this scope; (3) the time needed to execute the representation of a source
statement containing k functional operators, d distinct variables in its domain,
and r distinct variables in its range approaches d + r + k, where time is
measured in memory references.


The work described herein was supported in part by the Department of Energy
under contract no. EY-76-03-0326-PA 39 and the Army Research Office-Durham
under contract no. DAAG-29-76-G-0001.
1.  INTRODUCTION

This report addresses the problem of representing programs for direct machine
interpretation. The obvious inadequacies of present machine architectures, in
terms of program size and execution time, are well known^1. Less obvious
secondary effects have led to complicated, even Byzantine system structures and
implementations^2. We contend that this is due to the fact that traditional
systems are based on the premise that the executable machine architecture must
be a fixed and hence universal language. The central thesis of this research is
that having to represent programs in a language that is fixed, a priori with
respect to system design, forces interpretation to occur at too low a level,
places too great a burden on the translation, and limits the potential
efficiency of a system.

^1 Cf. Flynn [6], Green [11], Lawson [19], Lunde [20], Weber [29], and
Wortman [32].

^2 E.g., contemporary compilers, linkage editors, and mechanisms for
recognizing and exploiting parallelism -- Sethi [25], and Wichman [30].

It is assumed that programs are initially expressed in a higher level source
language (HLL), which caters to both the user and the problems that must be
solved, but must ultimately be evaluated by a much lower level processor: the
system's host machine. Once the source language and host machine for a system
have been selected, the issue becomes one of determining the most suitable
intermediate language (or instruction set) for the system, which we call its
directly executed language (DEL). It is important that this intermediate
language preserve as much information concerning the user environment and
original source program structure as is useful in realizing concise
representation and expeditious interpretation (Figure 1).
1.1.  A Hierarchical Model

Modelling the evaluation process is complicated by the fact that a computation
is actually a hierarchy of interpretations, each level of which may be far more
complex than first apparent. Consider the sentence: "An algorithm is defined by
a collection of tasks (programs) composed of higher level language statements
that are compiled into sequences of lower level instructions, which eventually
cause the host machine to undergo a series of state transitions". This
describes the five level hierarchy illustrated below:


    Algorithm -- specifies
        Tasks -- composed of
            HLL Statements -- expanded into
                DEL Instructions -- causing
                    State Transitions in the Host

    Hierarchical Structure of a Problem

Each level represents the program (algorithm) in a different way; i.e., defines
the same process, even though the coding of individual commands is different.
The problem of representing programs in an efficient manner begins at the
uppermost level, and is affected by each of the processes involved in an
evaluation. Unfortunately, it is difficult (if not impossible) to recover from
faulty program representations at higher levels through sophisticated
interpretation techniques at lower levels. This is troublesome, since we would
like to minimize both the space needed to represent a program and the time
needed to interpret it. Hence, while the significance of uniform formal
techniques for defining ideal program representation and interpretation should
not be underestimated, this report focuses only on the three lower levels of
the hierarchy; it is simply assumed that algorithms are expressed efficiently
at higher levels.

1.2.  Programs,  Instructions,  and  Computations 

At any level of the hierarchy, a program may be defined as a finite set of
labelled instructions $\{I_k\}$. Each instruction specifies a pair of rules: an
action rule A; and a sequencing rule S. The computation produced by executing a
program is defined in terms of a sequence of states, where each state denotes a
specific assignment of values to program objects. Each action rule defines a
function (or operator) f, which takes some number of arguments (dependent on
its order) and maps them into (usually) a single result; arguments and results
are, collectively, called operands.


The number of operands in an instruction is fixed and determined by $f_k$.
Action rules are often expressed algebraically, e.g.:

    $$f_k(x_{k,1}, x_{k,2}, \ldots, x_{k,n})$$

(where $n$, called the order of $f_k$, is the number of arguments required by
$f_k$). The number of different functions that can be specified by an
instruction set is its vocabulary, or operator set. In general purpose
computers, the order of these functions rarely exceeds two, with at most one
result being produced.

Each sequencing rule defines the successor to the k-th instruction whenever it
is executed. In most familiar computer organizations, sequencing is a simple
operation, each instruction having only a single successor. However, specific
instructions may require inspection of several arguments before it can be
determined which of several possible successors is correct, as in the familiar
conditional branch instruction.
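
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Instruction`, `run`, and `countdown_program`, and the two-instruction example program, are hypothetical and do not come from the report. Each labelled instruction carries an action rule (a state transformation) and a sequencing rule (a function from the state to the successor label); a conditional branch is simply a sequencing rule that inspects the state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

State = Dict[str, int]  # one state: an assignment of values to program objects


@dataclass
class Instruction:
    action: Callable[[State], None]              # action rule A
    successor: Callable[[State], Optional[str]]  # sequencing rule S


def run(program: Dict[str, Instruction], start: str, state: State) -> State:
    """Execute labelled instructions until a sequencing rule yields no successor."""
    label: Optional[str] = start
    while label is not None:
        instruction = program[label]
        instruction.action(state)             # apply the action rule
        label = instruction.successor(state)  # apply the sequencing rule
    return state


def countdown_program() -> Dict[str, Instruction]:
    """Hypothetical program: add n, n-1, ..., 1 into total."""
    def add(s: State) -> None:
        s["total"] += s["n"]

    def decrement(s: State) -> None:
        s["n"] -= 1

    return {
        "L1": Instruction(add, lambda s: "L2"),  # single fixed successor
        # Conditional branch: the successor depends on the current state.
        "L2": Instruction(decrement, lambda s: "L1" if s["n"] > 0 else None),
    }


final_state = run(countdown_program(), "L1", {"n": 3, "total": 0})
print(final_state)
```

The sequence of dictionaries that `state` passes through plays the role of the computation, i.e., the sequence of states produced by the program.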


1.3.  Identifiers  and  Name  Spaces 

An additional aspect of computation concerns the means by which program
objects (the arguments or results of action rules) are identified. In general,
names are used as surrogates for objects, which are associated with specific
values by the current state of a computation. It is useful to distinguish
between the logical name of an object, and the specific encoding of that name
appearing in a given instruction, commonly called an identifier.

When an action rule is applied, the encoded names within its instruction must
be associated with, or bound to, the appropriate program objects. This process
is called referencing. The set of names for all objects referenced during a
computation is called the process name space; the set of all identifiers
appearing in a program is called the program name space. It is important to
distinguish between these two concepts: the name space of a process is
generally data dependent, and dynamic in nature; the name space of a program is
defined by its encoding, and is fully static. Users relate the observable but
low level results of executing a program (i.e., the sequence of host machine
states produced) to source level semantics through a mental association
established between the source level name space and the host name space. The
complexity, and accuracy, of this mapping determines the ultimate transparency
of a system.


2.  TOWARDS  IDEAL  PROGRAM  REPRESENTATIONS  [8] 


By what criteria should program representations be judged? Clearly, an
efficiency measure should lie in some sort of space-time product involving both
the space needed to represent an executable program and the time needed to
interpret it; although other factors, such as the space and time needed to
create executable representations, or the space needed to hold the interpreter,
may also be important. This report considers only the space and time needed to
represent and execute a program.

2.1.  Canonic Interpretive Forms

Characterizing  "ideal"  program  representations  can  be  either 
trivial  or  extremely  complicated,  depending  on  one's  point  of  view. 
Neither  extreme  offers  significant  insight  into  the  problems  at  hand, 
however.  It  is  therefore  imperative  to  develop  constructive  space- 
time  measures  that  can  be  used  to  explore  practical  alternatives. 
Although  these  measures  need  not  be  achievable,  they  should  be  satis- 
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programming  language. 


1:1 Property

    Instructions -- one CIF instruction is permitted for each non-assignment
    type operation in a HLL statement.

    Name Space -- one CIF name is permitted for each unique^3 HLL name in a
    HLL statement.


Log2 Property

    Instructions -- each CIF instruction consists of: a single operation
    identifier of size $\lceil \log_2 F \rceil$ bits^4; and one or more operand
    identifiers, each of which is of size $\lceil \log_2 V \rceil$ bits^5.


Referencing Property

    Instructions -- each HLL procedural (program control) statement causes
    one canonic reference.

    Name Space -- one reference is allowed for each unique variable or
    constant in the HLL statement.

^3 I.e., distinct name in the HLL statement; "A = A+1" contains two unique
names: the variable "A" and the constant "1".

^4 F is the number of distinct HLL operators in the scope of definition for the
given HLL statement.

^5 V is the number of distinct HLL program objects (variables, labels,
constants, etc.) in the relevant scope of definition.

Space is measured by the number of bits needed to represent the static
definition of a program; time by the number of instructions and name space
references needed to interpret the program. Source programs to which these
measures are applied should themselves be efficient expressions of an optimal
abstract algorithm, so as to eliminate the possible effects of algorithm
optimization during translation, such as changing "X = X/X" to "X = 1".

Generating canonic program representations should be straightforward because
of the 1:1 property. Traditional three address architectures^6 also satisfy the
first part of this criterion, but do not have the unique naming property.

For example, the statement "X = X + X" contains only one unique variable, and
hence can be represented by a single CIF instruction consisting of only one
operation identifier and one operand identifier. The three address
representation of this statement also requires only a single instruction, but
it would consist of four identifiers rather than the two required by the CIF.

^6 I.e., instruction sets of the form "OP X Y Z", where OP is an identifier for
a (binary) operation; X the left argument; Y the right argument; and Z the
result.

There may be some confusion as to what is meant by an "operation". Functional
operators (+, -, *, /, SQRT, etc.) are clear enough; however, allowance must
also be made for selection operators that manipulate structured data. For
instance, we view the array specification "A(I,J)" as a source level expression
involving one operator (two dimensional qualification) and at least three
operands (the array A, and its subscripts I and J). Therefore, unlike the
previous case, the canonic equivalent of "A(I,J) = A(I,J) + A(I,J)" requires
two instructions: the first to select the proper array element, and the second
to compute the sum. Thus:

    Example 1:  X = X + X
    Example 2:  A(I,J) = A(I,J) + A(I,J)

The @ operator computes the address of the doubly indexed element "A(I,J)",
and dynamically completes the definition of the local identifier "A_IJ". This
identifier is then used in the same manner as the identifier "X" is used in the
first example.

We count each source level procedural operator, such as IF or DO, as a single
operator. The predicate expression of an IF must, of course, be evaluated
independently if it is not a simple variable reference. Distinct labels are
treated as distinct operands, so that:

    Example 3:  IF (X-Y) 10,20,30

        | - | X | Y |     | IF | 10 | 20 | 30 |


Two accesses to the process name space (references) are required to execute
the first example: one to fetch the value of X as an argument, and one to
update its value as a result of executing the statement. In example two, four
references are required: one each to fetch the values of I and J for the
subscripting operation; one to fetch the value of A_IJ as an argument; and one
to update the value of this array element after execution. Note that no
references are required to access the array A, even though it appears as an
operand of the @ function. In general, no single identifier in a CIF
instruction can cause more than one reference unless it is bound to both an
argument and a result, and then it will initiate only two references. No
references are needed for either example just to maintain the instruction
stream, since the order of execution is entirely linear^7. The 1:1 property
measures both space and time, while the log2 property measures space alone, and
the referencing property measures time alone. These measures may be applied
either statically or dynamically, although static reference counts are strictly
comparative, and hence of limited value.

^7 The assumption here is that such reference activity can be fully overlapped
since it is so predictable.

The 1:1 property defines, in part, a notion of transformational completeness,
a term which we use to describe any intermediate language satisfying the first
canonic measure. Translation of source programs into a transformationally
complete language should require neither the introduction of synthetic
variables, nor the insertion of non-functional memory oriented instructions^8.
However, since the canonic measures described above make no allowance for
distinguishing between different associations of identifiers to arguments and
results, it is unlikely that any practical language will be able to fully
satisfy the CIF space requirements.

2.2.  Comparison  of  CIF  to  Traditional  Machine  Architectures 

Consider the following three line excerpt from a FORTRAN subroutine:

    1   I = I + 1
    2   J = (J-1)*I
    3   K = (J-1)*(K-I)

Assume that I, J, and K are fullword (32 bit) integers whose initial values
are stored in memory prior to entering the excerpt, and whose final values must
be stored in memory for later use before leaving the excerpt. The canonic
measures for this example are:


^8 E.g., to hold the results of intermediate computations, or move data about
within the storage hierarchy merely to make it accessible to functional
operators.

CANONIC MEASURE OF THE FORTRAN FRAGMENT

Instructions

    Statement 1 --  1 instruction   (1 operator)
    Statement 2 --  2 instructions  (2 operators)
    Statement 3 --  3 instructions  (3 operators)

    Total           6 instructions  (6 operators)

Instruction Size

    Identifier Size

        Operation identifier size = $\lceil \log_2 4 \rceil$ = 2 bits
            (operations are: +, *, -)
        Operand identifier size   = $\lceil \log_2 4 \rceil$ = 2 bits
            (operands are: 1, I, J, K)

    Number of Identifiers

        Statement 1 --  3 identifiers  (2 operand, 1 operator)
        Statement 2 --  5 identifiers  (3 operand, 2 operator)
        Statement 3 --  7 identifiers  (4 operand, 3 operator)

        Total          15 identifiers  (9 operand, 6 operator)

Program Size

    6 operator identifiers x 2 bits = 12 bits
    9 operand identifiers  x 2 bits = 18 bits

    Total                             30 bits

References

    Instruction Stream --  1 reference (nominal)
    Operand Loads      --  9 references
    Operand Stores     --  3 references

    Total                 13 references
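
The totals in the canonic measure above can be reproduced mechanically. The sketch below is an illustration rather than the authors' procedure: the function name `canonic_measure` and the regular-expression tokenizer are assumptions, and the tokenizer handles only simple assignments built from names, integer constants, and the operators + - * /. It applies the 1:1 property (one instruction per operator, one operand identifier per unique name in a statement), the log2 property (identifier widths), and the referencing property (one load per unique argument, one store per result, plus one nominal instruction stream reference).

```python
import re
from math import ceil, log2

FRAGMENT = ["I = I + 1", "J = (J-1)*I", "K = (J-1)*(K-I)"]


def canonic_measure(statements):
    """Apply the three canonic properties to a list of assignment statements."""
    operator_ids = operand_ids = loads = stores = 0
    operators_seen, names_seen = set(), set()
    for statement in statements:
        target, expression = (side.strip() for side in statement.split("=", 1))
        operators = re.findall(r"[+\-*/]", expression)
        arguments = set(re.findall(r"[A-Za-z]\w*|\d+", expression))
        names = arguments | {target}       # unique names in this statement
        operator_ids += len(operators)     # 1:1 property: one instruction each
        operand_ids += len(names)          # 1:1 property: one identifier each
        loads += len(arguments)            # referencing property: arguments
        stores += 1                        # referencing property: the result
        operators_seen.update(operators)
        names_seen.update(names)
    # Log2 property: widths follow from the distinct operators and objects
    # seen in the fragment (three operators and four operands both need 2 bits).
    operator_bits = ceil(log2(len(operators_seen)))
    operand_bits = ceil(log2(len(names_seen)))
    return {
        "instructions": operator_ids,
        "identifiers": operator_ids + operand_ids,
        "program_bits": operator_ids * operator_bits + operand_ids * operand_bits,
        "references": 1 + loads + stores,  # 1 nominal instruction stream reference
    }


measure = canonic_measure(FRAGMENT)
print(measure)
```

For the three statement fragment this yields 6 instructions, 15 identifiers, 30 bits, and 13 references, matching the totals listed above.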
The following listing was produced on an IBM System 370 using an optimizing
compiler^9:

    1   L     10,112(0,13)
        L     11,80(0,13)
        LR    3,11
        A     3,0(0,10)
        ST    3,0(0,10)
    2   L     7,4(0,10)
        SR    7,11
        MR    6,3
        ST    7,4(0,10)
    3   LR    4,7
        SR    4,3
        LCR   3,3
        A     3,8(0,10)
        MR    2,4
        ST    3,8(0,10)


^9 FORTRAN IV level H, OPT = 2, run in a 500K partition on a Model 168,
June 1977.

A total of 368 bits are required to contain this program body (we have
excluded some 2000 bits of prologue/epilogue code required by the 370
Operating System and FORTRAN linkage conventions), over 12 times the space
indicated by the canonic measure. Computing reference activity in the same way
as before, we find 48 accesses to the process name space are required to
evaluate the 370 representation of the FORTRAN excerpt. If allowance is made
for the fact that register accesses consume almost no time in comparison to
accesses to the execution store, this count drops to 20 references, allowing
one access for each 32 bit word in the instruction stream.
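
The 368 bit and 20 reference figures can be checked with a short calculation. The sketch below rests on two assumptions that the report does not spell out: that the listing uses the standard System/370 instruction lengths (16 bits for register-to-register RR instructions, 32 bits for register-and-storage RX instructions), and that each RX instruction in this listing makes exactly one access to a storage operand. The format assigned to each mnemonic is read off the operand syntax of the listing.

```python
from math import ceil

FORMAT_BITS = {"RR": 16, "RX": 32}  # standard System/370 instruction lengths

# (mnemonic, format) for the 15 instructions of the optimized listing.
LISTING = [
    ("L", "RX"), ("L", "RX"), ("LR", "RR"), ("A", "RX"), ("ST", "RX"),
    ("L", "RX"), ("SR", "RR"), ("MR", "RR"), ("ST", "RX"),
    ("LR", "RR"), ("SR", "RR"), ("LCR", "RR"), ("A", "RX"), ("MR", "RR"),
    ("ST", "RX"),
]

program_bits = sum(FORMAT_BITS[fmt] for _, fmt in LISTING)

# One main store access per 32 bit word of the instruction stream.
instruction_words = ceil(program_bits / 32)

# Assumption: every RX instruction here reads or writes one storage operand.
storage_operands = sum(1 for _, fmt in LISTING if fmt == "RX")

memory_references = instruction_words + storage_operands
print(program_bits, instruction_words, storage_operands, memory_references)
```

Under these assumptions the listing occupies 368 bits in 12 instruction words, and the 8 storage operand accesses bring the total to 20 main store references.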

The increase in program size, number of instructions, and number of memory
references is a direct result of the partitioned name space, indirect operand
identification, and restricted instruction formats of the 370 architecture. In
order to facilitate the discussion at this point, it is useful to define [6]
three general classes of instructions:


M-instructions, which simply move data items within the storage hierarchy
    (e.g., the familiar LOAD and STORE operators);

P-instructions, which modify the default sequencing between instructions
    during execution (e.g., JUMP, BRANCH and LINK operators); and

F-instructions, which actually perform functional computations by assigning
    new values to result operands after transforming the current values of
    argument operands (e.g., all arithmetic, logical, and shifting operators).

Instructions that merely rearrange data across partitions of a memory name
space, or that alter the normal order of instruction sequencing, are "overhead"
in the sense that they do not directly contribute to a computation. The ratio
of these overhead instructions (i.e., M- and P-type instructions in our
terminology) to functional instructions (F-instructions) is indicative of the
use of an architecture. Overhead instructions must be inserted into the desired
sequence of F-instructions to match the computational requirements of the
original program to the capabilities of the machine architecture. Statically,
M-instructions are by far the most common overhead instructions; indeed, they
are the most common type of instruction in almost all existing machines.
Dynamically, however, P-instructions become equally significant.

The table below illustrates the use of ratios for the foregoing example.

COMPARISON FOR THE EXAMPLE

                          370 FORTRAN-IV (level H extended)
                          optimized      non-optimized        CIF

    No. of Instructions      15               19                6
    M-type Instructions       9               13                0
    F-type Instructions       6                6                6
    M-ratio                   1.5              2.7              0
    Program Size           368 bits         604 bits         30 bits
    Memory References        20               36               13
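
The M-ratio for the optimized listing can be recomputed by classifying each mnemonic. The classification below is an assumption made for illustration: A, SR, and MR are taken as F-type, and everything else in the listing (L, LR, ST, and also LCR) is counted as M-type so that the totals agree with the 9 M-type and 6 F-type instructions reported for the optimized code. The fragment contains no branches, so no P-type instructions occur.

```python
# Mnemonics of the optimized 370 listing, in order.
MNEMONICS = ["L", "L", "LR", "A", "ST",
             "L", "SR", "MR", "ST",
             "LR", "SR", "LCR", "A", "MR", "ST"]

# Assumed classification (hypothetical; chosen to match the reported counts).
F_TYPE = {"A", "SR", "MR"}   # arithmetic on argument operands
P_TYPE = set()               # no sequencing instructions in this fragment


def classify(mnemonic: str) -> str:
    """Return 'F', 'P', or 'M' for one mnemonic under the assumed classification."""
    if mnemonic in F_TYPE:
        return "F"
    if mnemonic in P_TYPE:
        return "P"
    return "M"


counts = {"M": 0, "P": 0, "F": 0}
for mnemonic in MNEMONICS:
    counts[classify(mnemonic)] += 1

m_ratio = counts["M"] / counts["F"]
print(counts, m_ratio)
```

With this classification the ratio is 9/6 = 1.5, the value given for the optimized code. A CIF representation has no M-type instructions, so its M-ratio is zero.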
3.  DEL SYNTHESIS


This section addresses the problem of designing high performance DELs. We
focus on three particular areas:

Sequencing, which has two aspects --

    a.  Sequencing between actions (program control).

    b.  Sequencing within an action (context).

Action Rules, which also have two aspects --

    a.  The format or transformation used by the rule.

    b.  The operation invoked.

Name Space, which addresses two issues --

    a.  Name structure: the syntax and semantics of identifiers.

    b.  Name environment: referencing of variables and operators.

Each of these areas will be reviewed following a statement of term
definitions and assumptions.


3.1.  Terms and Assumptions

In order to synthesize simple "quasi-ideal" DELs, let us make some obvious
assignments and assumptions.

* The DEL program representation lies in the main storage of the host
  machine.

* The interpreter for the DEL lies in a somewhat faster, smaller interpretive
  storage. The interpreter includes the actual interpretive subroutines as
  well as certain parameters associated with interpretation.

* Only a small number of registers exist in the host machine that can be used
  to contain local and environmental information associated with the
  interpretation of the current DEL instruction. Further, it is assumed that
  communications between interpretive storage and this register set can be
  overlapped (Figure 2(a)).


    [Diagram labels: MICRO STORE; MAIN MEMORY; INSTRUCTION ENVIRONMENT;
    FUNCTION & SCOPE; PROGRAM ENVIRONMENT]

    Figure 2(a):  DEL/Host Storage Assignment
An instruction is a binary string partitioned into identifiers under action of
the interpretive program. An identifier is an element of the vector bit string
specifying one of the following:

    i.    format and (implicitly) the number of operands

    ii.   the operands

    iii.  operations to be performed (of at most binary order) on the
          identified operands

    iv.   sequencing information, if required.

A format is a rule defining:

    i.    the instruction partition (i.e., number and meaning of
          identifiers).

    ii.   the order of the operation (i.e., whether the operation is
          nullary, unary or binary).

    iii.  precedence among operands (i.e., binding of operand identifiers to
          functional operands).

In this report, it is assumed that DEL instructions are use ordered; i.e.,
that the internal sequence of identifiers within an instruction is the same as
the sequence in which these identifiers will be required during
interpretation. The 370 architecture is not use ordered, since the
format/operation code appears before operand identifier information. This
forces the interpreter to "save" the operation code during computation of
effective addresses, wasting, at least temporarily, a scarce host register.
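
A use ordered instruction can be decoded in a single left-to-right pass, since each field is consumed as soon as it is read. The sketch below is a minimal illustration under assumptions that are not taken from the report: the `FORMATS` table, its three formats, and the meanings noted in the comments are hypothetical, and field widths are simply the ceiling of log2 of the number of formats, operands, or operators in the current scope. The format identifier comes first because it determines the partition of the rest of the string; the operation identifier comes last because it is not needed until the operands have been located.

```python
from math import ceil, log2

# Hypothetical formats: field names in the order the interpreter needs them.
FORMATS = {
    0: ("A", "OP"),            # e.g. A = A OP A   (one operand identifier)
    1: ("A", "B", "OP"),       # e.g. B = A OP B   (two operand identifiers)
    2: ("A", "B", "C", "OP"),  # e.g. C = A OP B   (three operand identifiers)
}


def width(count: int) -> int:
    """Bits needed to distinguish `count` items (at least one bit)."""
    return max(1, ceil(log2(count)))


def decode(bits: str, n_operands: int, n_operators: int):
    """Partition a bit string into identifiers, strictly left to right."""
    position = 0

    def take(field_width: int) -> int:
        nonlocal position
        value = int(bits[position:position + field_width], 2)
        position += field_width
        return value

    fmt = take(width(len(FORMATS)))  # the format fixes the partition
    fields = {}
    for name in FORMATS[fmt]:
        field_width = width(n_operators) if name == "OP" else width(n_operands)
        fields[name] = take(field_width)
    return fmt, fields, position     # position = instruction length in bits


# Scope with 4 operands and 4 operators: every identifier is 2 bits wide.
three_operand = decode("10" "00" "01" "10" "11", n_operands=4, n_operators=4)
one_operand = decode("00" "01" "10", n_operands=4, n_operators=4)
print(three_operand)
print(one_operand)
```

Because the decoder never has to hold a field while later fields are being resolved, no host register is tied up saving an operation code, which is the point made about use ordering above.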

The size of an identifier is the width of the field it occupies within an
instruction. It is determined by the number of elements required in a
locality; the structure of a typical DEL instruction is illustrated in
Figure 2(b).


    | FORMAT | A | B | C | OP | A |

    [Diagram labels: OPERATION IDENTIFIER; OPERAND IDENTIFIERS;
    INTERFACE IDENTIFIER]

    Figure 2(b):  Layout of a Typical DEL Instruction

3.2.  Sequencing  Rule 

Usually, a program consists of a sequence of action rules. The sequencing
rule provides the ordering relation among the action rules; i.e., it defines
the sequence of the action. While it is possible to conceive of DELs with
unordered action rules (no sequence rule), this form is of little value.

3.2.1.  Sequencing  Between  Actions 

In practice only a few sequencing rules have been used with any degree of
success. We consider the following three rules:

Linear: individual instructions are stored in a one dimensional array within
    the main store. Execution order is the same as the array ordering unless
    modified by a branch instruction.

Binary Tree: instructions are mapped into the nodes of a tree structure
    maintained in main store. Leaf nodes normally correspond to data
    references; ancestor nodes to semantic functions. A standard traversal
    algorithm defines the default order of execution, which can be modified
    by visiting a branch node.

Linked List: instructions are stored at the links in a chain structure
    maintained in main store. The default execution order is again specified
    by a traversal algorithm, and can be modified by the semantics associated
    with the most recently visited link.

These three forms are abstracted from well known programming structures. Most
traditional machine language DELs are based on a linear form. Tree forms are
widely used as intermediate data structures by compilers. Linked lists are the
fundamental program and data structures for LISP and PPL (McCarthy [21], and
Standish [26]). Tree and list data structures are widely used in the
algorithms employed in artificial intelligence and information retrieval
applications. Figure 3 illustrates program representations in the linear,
tree, and list forms.

The particular DEL organization used in these examples is arbitrary, for
purposes of illustration only, and is not necessarily optimal. Similarly,
neither the operators nor data structures are completely specified; they
should be assumed to have the same general interpretation for all three DEL
forms. These fragments are constructed so that the order of execution will be
identical (i.e., the sequence of functional operations and storage accesses
will be the same).

Figure 3:  Three Representations of "I = J*(K+L);"
    (panel (a): linear form)
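
The three sequencing forms can be compared on the statement "I = J*(K+L)" with a small sketch. Everything below is hypothetical and is not a reconstruction of the report's figure: the stack-oriented instruction set (`push`, `+`, `*`, `store`), the tuple encoding of the tree, and the `Link` class are assumptions chosen for brevity. The point illustrated is only that a linear array, a post-order tree traversal, and a walk along a linked chain can be arranged to deliver the same sequence of operations and storage accesses.

```python
from dataclasses import dataclass
from typing import Optional

# (a) Linear form: instructions in array order.
LINEAR = [("push", "J"), ("push", "K"), ("push", "L"),
          ("+",), ("*",), ("store", "I")]

# (b) Tree form: leaves are data references, interior nodes are operators.
TREE = ("=", "I", ("*", "J", ("+", "K", "L")))


# (c) Linked list form: each link holds one instruction and its successor.
@dataclass
class Link:
    instruction: tuple
    successor: Optional["Link"] = None


def make_chain(instructions):
    head = None
    for instruction in reversed(instructions):
        head = Link(instruction, head)
    return head


def linear_order(code):
    yield from code


def tree_order(node):
    """Post-order traversal: operands before the operator that uses them."""
    if isinstance(node, str):
        yield ("push", node)
    elif node[0] == "=":
        yield from tree_order(node[2])
        yield ("store", node[1])
    else:
        operator, left, right = node
        yield from tree_order(left)
        yield from tree_order(right)
        yield (operator,)


def list_order(link):
    while link is not None:
        yield link.instruction
        link = link.successor


def execute(stream, env):
    """Run a stream of stack instructions; return the order of execution."""
    stack, trace = [], []
    for instruction in stream:
        trace.append(instruction)
        op = instruction[0]
        if op == "push":
            stack.append(env[instruction[1]])
        elif op == "store":
            env[instruction[1]] = stack.pop()
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "+" else left * right)
    return trace


results = []
for stream in (linear_order(LINEAR), tree_order(TREE),
               list_order(make_chain(LINEAR))):
    env = {"J": 2, "K": 3, "L": 4}
    results.append((execute(stream, env), env["I"]))
print(results[0])
```

All three forms produce the same trace and the same value for I; they differ only in how the successor of each instruction is found.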
3.2.1.1.  Linear Forms:

The sequencing rule for a DEL governs the way in which control is passed from
one instruction to another. If a linear form is used, for example, the normal
sequence of execution is implied by the placement of DEL instructions within
the main store. A program counter is usually maintained within the
interpreter, as part of the DEL program status vector, which points to the
word containing the next DEL instruction to be executed. When the contents of
the current instruction word are interpreted, the word pointed to by the
program counter is fetched, the counter incremented appropriately, and
execution continues. Interpreting a branch instruction causes the DEL program
counter to be loaded with a new address that points to the next instruction to
be executed. The set of branching instructions in a DEL is not confined to the
simple GOTO, but may also include more complex program control operators such
as CALL, RETURN, DO, and IF-THEN-ELSE.
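
The fetch, increment, and branch cycle described above can be sketched as a small interpreter loop. The instruction names (`ADD`, `IFZERO`, `GOTO`, `HALT`), the tuple encoding, and the example program are hypothetical and serve only to show the role of the program counter; a real DEL would pack instructions into main store words rather than Python tuples.

```python
def interpret(store, env, pc=0):
    """Interpret a linear DEL: fetch at pc, increment pc, then execute."""
    while True:
        op, *args = store[pc]   # fetch the word the program counter points to
        pc += 1                 # default successor: the next instruction in store
        if op == "ADD":         # F-type: result = left + right
            result, left, right = args
            env[result] = env[left] + env[right]
        elif op == "IFZERO":    # conditional branch: reload pc if operand is zero
            operand, target = args
            if env[operand] == 0:
                pc = target
        elif op == "GOTO":      # unconditional branch: reload pc
            pc = args[0]
        elif op == "HALT":
            return env


# Hypothetical program: total = n + (n-1) + ... + 1
PROGRAM = [
    ("IFZERO", "n", 4),               # 0: leave the loop when n reaches zero
    ("ADD", "total", "total", "n"),   # 1: total = total + n
    ("ADD", "n", "n", "minus_one"),   # 2: n = n - 1
    ("GOTO", 0),                      # 3: back to the test
    ("HALT",),                        # 4
]

final_env = interpret(PROGRAM, {"n": 3, "total": 0, "minus_one": -1})
print(final_env)
```

Only the two branch instructions touch the program counter explicitly; every other instruction relies on the default increment, which is what makes the linear form compact.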


Since the default sequencing rule for a linear DEL is to simply process the
instruction stored "immediately after" the one just executed, there is a good
match between this form and cyclically addressable main stores. This can be
exploited by carefully packing DEL instructions so that the essential fetch
and sequence steps within the basic cycle of interpretation can be implemented
efficiently. This can almost always be achieved with minimal execution time
overhead using only elementary shift and increment capabilities.

The natural ordering of addressable storage cells can be used to
induce a default order of interpretation, thus eliminating the need
for explicit sequencing pointers in linear segments of DEL code. As
individual instructions are more highly compressed, fewer main store
accesses are required to maintain a given DEL instruction stream. For
example, suppose that each instruction in a linear DEL contains the
address of its successor as an explicit subfield. An interpreter
would sequence through instructions by fetching the successor address
from the instruction just executed, and then obtaining the next
instruction to be executed from that address in main store. No
internal program counter need be maintained unless relative branching
is required.

This DEL could be made more efficient by eliminating explicit
successor addresses within instructions that do not cause a branch out
of the normal linear order. An interpreter for this new DEL must
maintain an internal program counter that is updated by the length of
the current instruction during each cycle of interpretation. However,
program representations will be smaller, and should be faster, than
those of the previous DEL, assuming that main store is sufficiently
slower than micro store.

3.2.1.2.  Tree Forms:

Tree structures are used by many compilers as an intermediate form
from which the final, executable code is generated. Intuitively,
ancestor nodes refer to operators (non-terminals in the source
language syntax), while leaf nodes refer to variables (syntactic
terminals). The operation code associated with a node is combined
with two or three pointers to form a unit of fixed, uniform size.
These units constitute the physical realization of a tree structure
within the main store of the host machine. The units for a binary
tree DEL need contain only two pointers in a minimal realization: (1)
the address of the unit for the left descendant of a node; and (2) the
address of the unit for its right descendant.


Unit Address:
     Left Descendant Address   -->  (unit)
     Right Descendant Address  -->  (unit)
     DEL Operation Code

Figure 4: Typical Binary Tree Unit

The left and right descendants of an ancestor node which is associated
with a binary operator correspond to its left and right operands,
respectively. Usually, the operators in a DEL are binary if a tree
structure form is selected; unary operators are treated as degenerate
binary operators, with null right descendant pointers. Some auxiliary
pointers (usually to the ancestor of a node) may be included to
facilitate tree traversal, however.

Perhaps the most widely used traversal strategy is "depth first, left
to right postorder", meaning that a node is executed only after both
its left and right descendants have been evaluated. Under this rule,
successive left descendants are visited until a "left value" is
computed, then the right descendant is visited (Knuth [18]). Only
after both the left and right values of a node are known will the node
itself be visited. Finding the unit for a successor node is a simple
matter, at least when traversing downward. Only a primitive load
operation is required at the micro level to extract the address of the
proper descendant unit, so DELs based on a tree form are easily
interpreted by a wide range of microprogrammable hosts.
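
The postorder rule can be illustrated with a small Python sketch. The
unit layout (operation code, left address, right address), the "var"
leaf convention, the NIL value, and the name eval_postorder are all
assumptions made for this example; only binary operators are handled.
The interpreter keeps a list of units that have been visited but not
yet executed, and returns to each one after its descendants have
produced their values.

```python
NIL = -1

# Hypothetical units for J * (K + L); a leaf holds a variable name
# in place of a left descendant address.
UNITS = [
    ("*", 1, 2),          # address 0: root
    ("var", "J", NIL),    # address 1
    ("+", 3, 4),          # address 2
    ("var", "K", NIL),    # address 3
    ("var", "L", NIL),    # address 4
]


def eval_postorder(units, root, env):
    """Evaluate a binary tree DEL depth first, left to right, postorder."""
    pending = [(root, False)]   # (unit address, descendants already evaluated?)
    values = []                 # computed left and right values
    order = []                  # order in which nodes are executed
    while pending:
        addr, ready = pending.pop()
        op, left, right = units[addr]
        if op == "var":
            values.append(env[left])
            order.append(left)
        elif not ready:
            pending.append((addr, True))     # come back after both descendants
            pending.append((right, False))
            pending.append((left, False))    # left is on top, so it is visited first
        else:
            rval, lval = values.pop(), values.pop()
            values.append(lval * rval if op == "*" else lval + rval)
            order.append(op)
    return values.pop(), order
```

The execution order produced for this tree is J, K, L, +, *, which is
the same sequence of operations a suffix form would list explicitly.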

There is a significant problem with the obvious implementation of
this algorithm, however: the interpreter must maintain a stack of
pointers to nodes that have been visited, but not yet executed.
Entries in this stack are the addresses of units associated with
non-terminal nodes that must be reexamined after computing the values
of lower level nodes. Maintaining this stack enlarges the interpreter
state and complexity. The need for this stack can be eliminated, at
the expense of DEL program space, by including a "back pointer" in
each unit that is the address of its immediate ancestor.

One potential advantage of tree DELs is that they are easy to modify
incrementally; i.e., a surrogate can be made to reflect small changes
at the source level without a full recompilation. The new subtree is
produced by recompiling only the affected portion of the source
program. This usually requires that program control transfer points
and DEL variables be identified by node rather than address, and may
also necessitate a run time "garbage collector" to reclaim the holes
left by excised DEL code.

Another potential advantage is that the interpretation of a subtree
can be bypassed during an execution if both: (1) the value computed
the last time its root node was visited is retained in the root's
unit; and (2) none of the values associated with the leaves of the
subtree has been modified since the root was last visited. In order
to obtain this advantage, though, a complex tagging scheme to mark the
validity of the values stored in ancestor units may be needed.
Unfortunately, the overhead of such a tagging scheme (incurred each
time a node is visited), together with the time required to store the
last computed values, may be greater than the time saved by escaping
the evaluation of some subtrees. It is not easy to evaluate the
tradeoffs involved, though, since adequate statistics are not easily
obtained. This strategy at least offers the possibility that tree
DELs can be developed which are effectively more compact and more
efficient to interpret than linear DELs.


3.2.1.3.  List Forms:

The simplest examples of linked lists look much like unary or binary
trees; in fact, most of the above tree related comments are equally
applicable to linked list DELs. However, the links within a list (its
nodes) may be their own ancestors; i.e., cycles are allowed. Again,
instructions are associated with the links in a list representation.
They contain a pointer to a successor link, and either an atomic value
or a pointer to a value link. A unique pointer, NIL ("0" in Figure
3(c)), is used as the successor pointer in terminal links.

This classic definition is easily extended to cover lists in which
links may reference multiple successor or value cells, thus reducing
the number of links needed to represent complicated control and data
structures. Traversal usually proceeds by value first, then
successor, analogous to depth first, left to right postorder tree
traversal.

Because of their generality, linked lists are not easily address
encoded. While the relative spatial cost of link pointers depends on
the average size of a DEL instruction, a linked list DEL almost always
requires more space than an equivalent linear form DEL, barring
extensive factoring of common sublists. However, the marginal cost of
incorporating additional address references is low for a linked list
DEL representation, and hence it is comparatively easy to implement
complex operators that do not easily fit in the binary operator order.

For example, the target of any branch can be directly encoded as one
of the successor pointers in its link unit, and need not be treated as
an indirect operand. This is not always possible in a tree DEL, since
cycles are not allowed. The flexibility of a linked list form can
also be exploited by linking units in precisely the order in which
they should be interpreted during execution. By converting the linked
list in Figure 3(c) into a Polish suffix form, for example,
backtracking during interpretation could be eliminated. This reduces
both the internal state size and complexity of the interpreter, but is
not compatible with the factoring technique described above.

In most cases, the pointers required by tree and list structures make
them less desirable than the linear array as a potential DEL form,
both because of the space these pointers occupy and because of the
extra main store access needed to determine the location of successor
instructions. It is usually far faster to increment a DEL program
counter (normally maintained in a host register) than to fetch an
address from main store. Unless the flexibility of tree and list
forms can be exploited in an innovative manner, the spatial and
temporal overhead associated with this single negative aspect may be
of ...

3.2.2.  Sequencing Within an Action: Context

Defining a sequence rule within an action is primarily a problem of
exploiting execution context during an action rule interpretation.
Context information may be used to significantly improve action rule
representation at the expense of some additional complexity in the
interpretation process. We consider five distinct types of context.

3.2.2.1.  No Dependencies

The simplest program representations involve no dependencies, and an
example of such DELs is "threaded code", in which each field occupies
a full word of storage, and is itself a direct pointer to either a
cell in the DEL data store (operand references) or to a semantic
routine in micro store (operator references). This straightforward
encoding may in fact be optimal if the host has little or no field
extraction capability, since each syllable starts on a word boundary
and need not be processed before use during interpretation.

Threaded code programs are similar to highly subroutinized host
programs in which there is one subroutine for each semantic routine
within the threaded code interpreter. However, CALL and RETURN
operators are omitted in the threaded code, which reduces its program
store requirements; the interpreter performs the function of the
deleted operators. Operands are usually passed as in-line vectors of
addresses, and operations indicated by explicit micro store addresses.
... though, just as arguments are imbedded in the calling sequences of
a host machine.

The time needed to fetch a threaded code instruction, in main memory
accesses, is k+1, where k is the average number of operands per
instruction. If we let b denote the number of bits per word of
storage, then the space required to represent a threaded code
instruction is b * (k+1) bits.
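
A small Python sketch may make the k+1 accounting concrete. It is a
model under stated assumptions, not an implementation from this paper:
the routines op_add and op_mul, the OPERAND_COUNT table, the cell
numbering, and the three-address instruction shape are all
hypothetical. Each element of the list stands for one full word of
program store, holding either a direct reference to a semantic routine
or the address of a data cell.

```python
def op_add(data, dst, a, b):
    data[dst] = data[a] + data[b]


def op_mul(data, dst, a, b):
    data[dst] = data[a] * data[b]


# Assumed: the interpreter knows how many operand words follow each routine.
OPERAND_COUNT = {op_add: 3, op_mul: 3}


def run_threaded(words, data):
    """Execute threaded code; return the number of program store fetches."""
    pc = 0
    fetches = 0
    while pc < len(words):
        routine = words[pc]                    # one fetch: operator reference
        k = OPERAND_COUNT[routine]
        operands = words[pc + 1:pc + 1 + k]    # k fetches: operand references
        fetches += k + 1
        routine(data, *operands)               # interpreter stands in for CALL/RETURN
        pc += k + 1
    return fetches


def threaded_bits(k, b):
    """Space for one threaded code instruction: b * (k+1) bits."""
    return b * (k + 1)


# Data cells for I, J, K, L and one temporary (assumed layout).
I, J, K, L, TEMP = range(5)
WORDS = [op_add, TEMP, K, L,
         op_mul, I, J, TEMP]
```

No field extraction is performed anywhere in the loop; every word is
used exactly as fetched, which is the property the text attributes to
this form.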

3.2.2.2.  Memory Dependencies

Given a word oriented host, we view instructions as fixed length
"records" containing a fixed number of subfields at known boundaries.
In this case, use ordering is of minimal importance, since the
syllable positions are always known. Selecting an optimal instruction
layout is basically an alignment problem; instructions should be
stored on bit addresses that minimize the number of main store
accesses required to extract critical fields. This problem is
examined from the perspective of the computer architect in Flynn and
Henderson [7].

Their analysis can be applied directly to the DEL synthesis problem,
although there are fewer free variables in this case since the host
machine is an assumed given. The relevant result is an analytic
expression for the average number of accesses required to retrieve a
group of F characters with character address I into a record of length
L.

Figure 5: Accessing KEY Fields in DEL Instructions


The group of F characters can be thought of either as an entire DEL
instruction (in which case the notion of a record also corresponds to
an instruction) or as a critical syllable (e.g., the KEY code) within
an instruction. In the latter case, the instruction is itself the L
character record. If each main store access retrieves n characters of
data, the average number of accesses needed to fetch the critical
portion of an instruction is

                 floor( (f + i - 1) / {ℓ,n} )
    ceil(F/n) + ------------------------------
                          n / {ℓ,n}

where: f = F Mod n (least positive residue; i.e., x Mod x = x),
ℓ = L Mod n (least positive residue), i = I Mod {ℓ,n} (least residue,
including 0), and {ℓ,n} = greatest common divisor of ℓ and n.


Although formidable in appearance, this equation is not difficult to
interpret. Clearly, the number of accesses required to fetch a DEL
instruction of length F from a unit of length L will be either
ceil(F/n) or ceil(F/n) + 1, depending on the number of word boundaries
crossed. This is determined by the starting address of the
instruction. The second term is an analytical representation of the
average effect of this placement, assuming that fields occupy integral
multiples of the basic storage quantum (e.g., eight bit bytes for a
360/370 environment). While this is a reasonable assumption for a
machine designer, character size is often a free variable to the DEL
designer (Hoevel and Wallach [13]).
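
The effect of placement can also be checked by direct enumeration.
The Python sketch below does not evaluate the analytic expression; it
simply counts word accesses for a field of F characters that begins at
character address I within each of a series of L character records,
with n characters delivered per access. The function names are
assumptions made for this illustration.

```python
from fractions import Fraction


def accesses(start, F, n):
    """Main store accesses needed to read F characters beginning at `start`."""
    first_word = start // n
    last_word = (start + F - 1) // n
    return last_word - first_word + 1


def average_accesses(F, I, L, n):
    """Average over consecutive records laid end to end in main store.

    The starting offsets within a word repeat after at most n records,
    so n records give the exact long run average.
    """
    total = sum(accesses(j * L + I, F, n) for j in range(n))
    return Fraction(total, n)
```

For instance, with four characters per access and six character
records, a two character field at I = 0 always falls inside one word,
while the same field at I = 1 straddles a word boundary in every
second record.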


If the host is strongly biased toward a particular character size,
then it is probably best to use this as the basic storage quantum for
DEL encodings. If the host is unbiased, however, the size of a
character should be selected to minimize F/n. The Flynn-Henderson
equation shows that it is best to start instructions on character
addresses that are integer multiples of {ℓ,n}. In this case, the time
needed to fetch a typical DEL instruction, in main storage accesses,
is:

                               f - {ℓ,n}
    Access Time = ceil(F/n) + -----------
                                   n

while the space needed to represent it is:

    Program Size = L * b/n = w * (k+1) bits

As above, k is the number of syllables that must be fetched and
decoded to execute the entire instruction, and b is the number of bits
per word; w is the average number of bits per syllable.

In most cases F is less than n, and so the average fetch time is
minimal when F is minimized; i.e., when pointers and/or instructions
occupy as few characters as possible. Decoding algorithms for this
type of DEL are usually straightforward. Since instructions are word
aligned, the exact bit offset of each subfield is known, and decoding
is at worst a simple combination of mask and shift operations.

In some cases, special features of the host can be exploited, such as
the transform board capability of the CDC 5600 series, which allows
the contents of a micro register to be "exploded" (i.e., distributed
across several other micro registers in a single micro instruction).
This board must be physically rewired for each such explosion desired,
however, and cannot be changed dynamically during an emulation
(Control Data Corporation [4]).

3.2.2.3.  Inter Instruction Dependencies

Both the sequence in which instructions are encountered and their
placement can affect their interpretation for certain DELs. The
primary reason for selecting a form with inter instruction
dependencies is to minimize the size of a typical DEL program, and
thus indirectly reduce the average fetch overhead. Since a relatively
large space penalty is usually incurred when a tree or list sequencing
rule is used, these forms are most often applied to linearly
sequential DELs.

To exploit the similarity between integer addressable stores and
locally sequential program structure, a design permitting multiple DEL
instructions to be placed in a single word of storage must be devised.
Minimizing the size of individual DEL instructions is quite important
here, although if an execution time advantage is to be realized the
encoding must be simple to recognize and decode.

Usually, the DEL program state vector is augmented so that the
interpreter can remember unused, but previously fetched, portions of
the DEL instruction stream. Specifically, a residual control cell
called the current instruction word (IW) is needed. This word
contains those bits in the DEL instruction stream that were brought
into host storage registers during the last instruction stream access
to main store, but which have not been decoded.

This type of dependency is most effective for hosts with wide storage
resources and a large ratio between main and micro store bandwidths.
To a first approximation, if an average of m instructions can be
packed into a single word, the time needed to fetch a given
instruction stream may be reduced by a factor of m compared to a fully
independent technique.

Interpreters for instruction stream dependent DELs must maintain at
least two elements of residual control: a DEL program counter (PC) and
a current instruction word (IW). If full prefetch is implemented, an
additional residual control cell is needed: a successor instruction
word (SW). The interpreter attempts to maintain the next word of
instruction stream bits in SW (i.e., keep SW equal to the contents of
the successor to the word last loaded into the IW). When all of the
bits in the IW have been decoded, its contents are replaced by the
contents of SW and the PC is updated. This allows most of the time
needed to transfer instruction words from main store into the internal
resources of the host to be overlapped, but it implies that the PC,
IW, and SW must be maintained in the fastest storage resource (i.e.,
host registers). Use ordering of syllables is important in a strongly
context dependent DEL, since such a large fraction of the micro level
storage resources must be dedicated to maintaining the DEL instruction
stream.


For example, decoding an operator specification prior to the
specifications of its operands (as in the natural sequence of
interpretation for the 360/370 architecture) forces the interpreter to
store the operator code across the operand fetch portion of the
interpretation cycle. This both lengthens execution time and
increases interpreter size. Also, instructions need not be word
aligned. This means that it may be more difficult to decode the
syllables, since it can no longer be assumed that they are aligned on
specific address boundaries.

If the host has a register pair shift capability, a K bit internal
field extraction may be accomplished by register pair shifting K bits
from the retained instruction stream word into a previously cleared
index register (IX). If the host has only a single word shift
capability, then both a mask and shift are required. Both of these
techniques are illustrated below.

In this diagram, lower case letters denote the codes of individual
syllables, and the "mask" is zero except at bit positions occupied by
the syllable code being extracted (i.e., "a"). Although shift
direction is critical in the register pair shift technique, it is easy
to develop a mask and shift strategy for hosts possessing only a
single left circular shift.
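
Both extraction techniques can be modeled in a few lines of Python.
This is a sketch under assumptions: a 16 bit word (W), syllables held
left justified in the IW, and a mask and shift variant that rotates
first and masks afterwards. The function names are hypothetical and
do not correspond to any particular host's micro operations.

```python
W = 16                          # assumed host word width in bits
WORD_MASK = (1 << W) - 1


def extract_pair_shift(iw, k):
    """Shift the register pair (IX, IW) left by k bits, with IX cleared first."""
    pair = iw << k              # IX occupies the high word of the pair
    ix = pair >> W              # the k leading bits of IW arrive in IX
    iw = pair & WORD_MASK       # zeros enter IW from the right
    return ix, iw


def rotate_left(word, k):
    """Single word left circular shift."""
    return ((word << k) | (word >> (W - k))) & WORD_MASK


def extract_mask_shift(iw, k):
    """Rotate the leading k bits to the low end, then separate them with a mask."""
    mask = (1 << k) - 1
    rotated = rotate_left(iw, k)
    ix = rotated & mask                  # the extracted syllable
    iw = rotated & ~mask & WORD_MASK     # remaining syllables, left justified
    return ix, iw
```

Under these assumptions the two routines return the same (IX, IW) pair
for any k between 1 and W-1, so the choice between them depends only
on which shift the host provides.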

3.2.2.4.  Memory Mapping and Word Boundary Dependencies

For the moment, assume that a DEL instruction consists of a sequence
of as yet undifferentiated syllables. These syllables may be of a
single, uniform width (often the case for Polish DELs), any of a fixed
number of different widths, or even of dynamically varying widths.
Consider the following three strategies for coping with these
possibilities:

i.  Dynamically concatenate successive words in the DEL program
store, in effect creating a "bit stream" memory.

ii.  Code the fact that the next n syllables lie within the current
instruction word as part of the semantic interpretation of the
first (or last) syllable in the instruction.

iii.  Reserve one syllable code (usually all zeroes) to signify "end
of instruction word"; i.e., that the current instruction word is
exhausted (has been interpreted), and a new instruction word
fetch is required.

The first technique is used in the Burroughs S-language
implementation for the B1700, a defined field host capable of
accessing arbitrary sized fields at bit addresses. Packing DEL
instructions at the bit level means that "every bit is fully
utilized", and "appears to account for half of all the program
compaction which has been realized on the B1700" (Wilner [31]).

There can be a high interpretation time penalty associated with
frequency encodings, however, since several sequential levels of
decoding may be required to correlate a syllable code with the proper
semantics. Wilner outlines an "SDL" encoding that is claimed to
obtain most of the compaction resulting from Huffman's code [14],
while still permitting reasonable decode times. The resulting Polish
form instructions are about thirteen bits in length (averaged over
both operator and data instructions), and require a maximum of three
stages of decode. Wilner estimates that a pure Huffman code would be
fourteen per cent slower to decode, but would only reduce the size of
a typical surrogate by one per cent.

These time estimates may be unique to the B1700 and the specific
interpretation algorithm used to process the S-languages. Although
Wilner claims only a 2.6 per cent slow down from a straight n-way
binary code to a 4-6-10 staged encoding, the manner in which this is
computed is not clear. It may be that little or no retention is used
by S-language interpreters, or that instruction fetch time is included
in the computation of decode time, which would certainly tend to
equalize differences between various techniques. Decoding SDL codes
on an EMMY [24] based system would require more than double the time
needed by a simple n-way binary code. This is equivalent to more than
40 per cent of a typical instruction execution; if a pure Huffman code
were used, this factor could double again. At least some direct
hardware assistance appears to be necessary for this technique to
achieve high performance.

The second strategy is nothing more than the familiar fixed field
organization used by most second and third generation "machine
languages". Once the first few bits of such a DEL instruction have
been decoded, the exact length and placement of all the subfields
within that instruction can be determined. In this case, the
Flynn-Henderson equation can be used to adjust the overall length of
the various instruction types so as to minimize the time needed to
fetch a given instruction stream; i.e., minimize the time needed to
access the critical fields that define the transformations to be
performed.

An interesting variation of this scheme is used for CRIL [15], in
which the semantics associated with the operation defined by an
instruction specify whether or not the next instruction to be executed
lies within the same word of storage as the current instruction. In
general, the successors to arithmetic operations lie in the same word,
while successors to conditional branches lie in the storage word at
the next higher address (assuming the branch is not taken; see ICL
[15]). The 360/370 "fixed format" inner form results in an average
instruction size of about 24 bits; the ICL approach reduces this to
about 20 bits, while maintaining the same relative instruction
capability.


The last technique was developed independently during the synthesis
of DELtran (Hoevel [12]). It approximates the bit stream packing
capability of the B1700, but requires only two registers, the
instruction index IX and instruction word IW, and is easily
implemented on hosts with flexible memory arrangements. Each DEL
instruction is treated as a string of syllables that is fetched and
decoded as follows:

1.  A syllable is extracted from the IW using either of the two
methods described above.

2.  If the IW is now zero, transfer of the next word in the
instruction stream into the IW is initiated.

3.  The appropriate routine is invoked, depending on the contents of
the IX, and execution continues with step one.

Using this technique, the all zeros code must be reserved to indicate
that the current instruction word has been exhausted, which is not
true for the SDL bit packing. However, the zero code strategy can be
implemented without increasing the size of the interpreter state
(either the IW or IX registers may be tested for equality with zero
after extracting a syllable), and a minimal number of host
instructions are involved. In contrast, a separate bit position
counter is required to properly concatenate successive SDL syllables
in hosts like the EMMY and CDC 5600, and extra host instructions may
be needed if the host is not sufficiently parallel.

The generation algorithm for this is to simply place successive
syllable codes into a word until the next code does not fit within
that word. The current word is then filled with zeros, and the
process is repeated for the next word in the DEL program store.

The following is a simple technique, hinging on the definition of
"fit", that can save some execution phase time and space. Suppose
that there are M bits in the next syllable code to be packed into a
word that has only N bits remaining, where M is greater than N. The
first N bits of this syllable can be packed into the current word if
its M-N trailing bits are zero; they will be supplied automatically by
the algorithm outlined above. This results in individual syllables
being logically, if not physically, contained within individual
program store words, but permits entire instructions to cross word
boundaries.
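
The generation rule and the "fit" refinement can be sketched together
with a matching decoder. The Python below is an illustration under
simplifying assumptions: all syllables share one width w, a word holds
B bits, syllables are stored left justified, and the names pack and
unpack are hypothetical. The decoder models the zero test on the IW
and relies on zeros entering from the right during extraction to
supply any trailing bits that were not physically stored.

```python
def pack(syllables, w, B):
    """Pack w bit syllables (never zero) into B bit words, zero filled."""
    words, cur, used = [], 0, 0
    for s in syllables:
        if not 0 < s < (1 << w):
            raise ValueError("syllable out of range; zero is reserved")
        free = B - used
        if free >= w:
            cur |= s << (free - w)          # ordinary case: whole syllable fits
            used += w
        elif free > 0 and s & ((1 << (w - free)) - 1) == 0:
            cur |= s >> (w - free)          # "fit": store only the leading bits
            used = B
        else:
            words.append(cur)               # rest of the word stays zero
            cur, used = s << (B - w), w
    if used:
        words.append(cur)
    return words


def unpack(words, w, B):
    """Decode by repeated extraction until the instruction word is zero."""
    mask = (1 << B) - 1
    out = []
    for iw in words:                        # one iteration per word fetch
        while iw != 0:                      # zero IW: the word is exhausted
            out.append(iw >> (B - w))       # extract the leading w bits
            iw = (iw << w) & mask           # zeros are shifted in on the right
    return out
```

With w = 5 and B = 16, three syllables leave one free bit. A fourth
syllable of 16 (binary 10000) has four trailing zeros and therefore
fits in that bit, whereas a fourth syllable of 9 forces a new word.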

By assigning these codes such that frequently occurring codes have a
greater number of trailing zeros, the beneficial effects of this
technique should be significantly improved. The information capacity
of any given syllable is decreased by the mandatory "all zeros" code
only if there are exactly 2**w other alternatives that must be
distinguished by its content, where w is the bit width of the
syllable.

Figure 7: "Fitting" Syllables at the End of a Storage Word


Intuitively, this gains some of the spatial advantage of Huffman-like
codes (at word boundaries) for the simple straight binary code, yet
permits rapid decode. In theory, it could also be used in conjunction
with more highly encoded forms (either SDL or pure Huffman). However,
the relative time gain would be smaller, since decode overhead would
dominate the instruction fetch, and the space gain would be reduced
due to the reservation of the all zeros code. Time and space
estimates for this form are:

    Access Time   = (k+1) * (R*w/b + shift(w) + test)

    Program Space = w * (k+1)

R, k, b, and w are again the same as for the threaded code and record
oriented code cases; "shift(x)" is the number of host instructions
required to extract an x bit field; and "test" is the number of host
instructions needed to check for the all zero code (which should be
zero in a well designed DEL host).

3.2.2.5.  Field Dependencies

So far, we have discussed only static dependencies. It is also
possible to take advantage of locality by dynamically changing the
interpretation of specific codes. That is, the semantics associated
with special DEL operators may be used to change the tables used by
the decode routine within the interpreter. While this generally
requires rather sophisticated compilation techniques (see Foster and
Gonter [9], and Sweet [28]), it may be possible to avoid exorbitant
overhead by applying this stratagem only when DEL control passes from
one module to another.

This is because of the one-to-one correspondence between DEL
modules and the lexical "scopes" in the source programs from which
they were derived. Fixing the size of an operand reference upon entry
to a DEL module can result in dramatic compression of program size,
and should be considered when synthesizing a DEL for any block
structured source language. Applying this technique to operator
references is more difficult, since there is no direct semantic
correlation between the set of operators applied in a module and the
definition of its scope.
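
The compression argument can be made concrete with a small sketch. It
assumes, following the abstract's statement that N bits suffice for a
scope containing fewer than 2**N identifiers, that the operand field
width is fixed per module on entry. The function names and the sample
scope sizes are hypothetical.

```python
def operand_field_width(identifier_count: int) -> int:
    """Smallest N such that identifier_count < 2**N."""
    if identifier_count < 1:
        raise ValueError("a scope must contain at least one identifier")
    return identifier_count.bit_length()


def operand_bits(scopes, fixed_width=None):
    """Total operand-reference bits for (identifier_count, reference_count) pairs.

    With fixed_width set, every reference uses that many bits (a conventional
    fixed address field); otherwise each scope uses its own width.
    """
    total = 0
    for identifier_count, reference_count in scopes:
        if fixed_width is None:
            width = operand_field_width(identifier_count)
        else:
            width = fixed_width
        total += width * reference_count
    return total


if __name__ == "__main__":
    scopes = [(5, 40), (12, 100), (3, 25)]        # hypothetical modules
    print(operand_bits(scopes))                   # per-scope field widths
    print(operand_bits(scopes, fixed_width=16))   # one 16 bit field everywhere
```

The comparison against a 16 bit field is only one possible baseline;
the saving depends entirely on how many identifiers each scope really
contains.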

Conceivably, escape codes could be used to reduce the number of
bits required to distinguish between individual DEL operators. As far
as the interpreter is concerned, the only cost of such conditional
operator codes would be the inclusion of distinct operator decode
tables for each escape class. Explicit escape codes may have to be
inserted at every potential target of an unstructured GOTO, however,
which will increase both the time and space required during execution.

A similar problem is encountered when generating register
oriented DEL surrogates, where the values of individual variables must
be saved before executing an unstructured GOTO, and restored upon
arrival at each potential target of a GOTO. Discussion of the flow
analysis techniques required to improve on this naive strategy is
beyond the scope of this work (see Geschke [10], Elson and Rake [5],
McKeeman [22] and [23]). Our concern is with the underlying structure
and form of a DEL.


3.3.  The  Action  Rule 


As mentioned in the first section, the action rule consists of a
function applied over a domain of arguments that produces one result.
There are two considerations in synthesizing an action rule: format
and operation.

The synthesis objectives for both considerations should be clear
from the earlier discussion of canonic interpretive form:

* Enough formats should be available to provide transformational
completeness;

* Each HLL operation should have a corresponding interpretation
within the limits of interpreter size.

3.3.1.  Formats 

In order to recognize and interpret DEL instructions, the
interpreter must be able to determine the size and meaning of at least
the next syllable to be fetched and decoded. The leading syllable in
an instruction usually specifies its layout and interpretation; i.e.,
defines the format of the instruction.

In order to select an optimal format set in an orderly manner, it
is necessary to first construct a universe of formats that at least
covers the combinatorial bindings found in traditional zero, one, two,
and three address architectures. For the moment, we need only
distinguish between two general classes of operand references:
explicit references, which appear as distinct syllables within an
instruction; and implicit references, which are defined by the
instruction's format code.

We use a three letter mnemonic code to describe associations of
implicit and explicit operands with at most two arguments and one
result (binary order). The first letter identifies the operand to be
bound to the left argument of the operator (if any); the second letter
identifies the operand to be bound to the right argument (if any);
while the third letter identifies the operand to be bound to the
result (if any). Seven letter designations are sufficient to describe
all relevant possibilities:

1.  "S", an implicit specification of the cell just above the top of
the evaluation stack (value denoted by s).

2.  "T", an implicit specification of the cell that was the top of the
evaluation stack (value denoted by t).

3.  "U", an implicit specification of the cell just below the top of
the evaluation stack (value denoted by u).

4.  "A", the first explicit operand specification appearing in an
instruction (value denoted by a).

5.  "B", the second explicit operand specification appearing in an
instruction (value denoted by b).

6.  "C", the third explicit operand specification appearing in an
instruction (value denoted by c).

7.  "_", for null, meaning "not applicable" (probably due to low
functional order).

Consider the following rules for eliminating formats that are
redundant with respect to our notion of transformational completeness
cited in the canonic interpretive form discussion.

1.  Formats violating standard LIFO stack accessing conventions are
not required (this would eliminate such formats as UAB, STU, ABU,
etc.).

2.  Only one ordering of T and U in the first two (argument) positions
is needed; we use the UT ordering, which is consistent with a left
to right, depth first post order traversal of the macro-tree
representation of a program.

3.  Formats that differ only by a permutation of explicit references
are equivalent (e.g., ABC, ACB, BCA, BAC, CBA, and CAB are all
equivalent; we choose the alphabetized element, ABC in this case).

4.  Formats differing only by a permutation of the null designator,
"_", in the first two (argument) positions are equivalent; we use
formats with a leading null.

All of the above elimination rules can be applied without adversely
affecting either the compilation or execution phase. Using these
rules, the 343 element format universe suggested by our combinatoric
identification rule (seven designators in each of three positions,
7**3 = 343) can be reduced to 30 elements. The table below lists all
distinct combinations remaining after these rules have been applied,
grouped in order of increasing functional order.

The branches in a macro definition tree [5] may be thought of
either as explicit references (if connected to a leaf node), or as
implicit references (if connected to an ancestor node). This
establishes a connection between format structure and the context of
operator nodes in a macro definition tree. By inspection, at least one
of the above formats is directly associated with each possible
configuration of an ancestor node.


Table of Potential Formats

MNEMONIC   TEMPLATE              SEMANTICS          STACK

___        <OP>                  call op
__S        <OP>                  s := op            +1
__A        <X> <OP>              x := op

_T_        <OP>                  call op(t)         -1
_A_        <X> <OP>              call op(x)

_TT        <OP>                  t := op(t)
_AS        <X> <OP>              s := op(x)         +1
_TA        <X> <OP>              x := op(t)         -1
_AA        <X> <OP>              x := op(x)
_AB        <X> <Y> <OP>          y := op(x)

UT_        <OP>                  call op(u,t)       -2
TT_        <OP>                  call op(t,t)       -1
AT_        <X> <OP>              call op(x,t)       -1
TA_        <X> <OP>              call op(t,x)       -1
AA_        <X> <OP>              call op(x,x)
AB_        <X> <Y> <OP>          call op(x,y)

UTU        <OP>                  u := op(u,t)       -1
TTT        <OP>                  t := op(t,t)
UTA        <X> <OP>              x := op(u,t)       -2
TTA        <X> <OP>              x := op(t,t)       -1
TAA        <X> <OP>              x := op(t,x)       -1
ATA        <X> <OP>              x := op(x,t)       -1
TAT        <X> <OP>              t := op(t,x)
AAS        <X> <OP>              s := op(x,x)       +1
TAB        <X> <Y> <OP>          y := op(t,x)       -1
ATB        <X> <Y> <OP>          y := op(x,t)       -1
AAB        <X> <Y> <OP>          y := op(x,x)
ABB        <X> <Y> <OP>          y := op(x,y)
ABS        <X> <Y> <OP>          s := op(x,y)       +1
ABC        <X> <Y> <Z> <OP>      z := op(x,y)
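
The STACK column follows a simple pattern that can be checked
mechanically. The sketch below is an inference from the table rather
than a rule stated in the text: each distinct stack cell (T or U)
named as an argument is popped, and a result bound to S, T, or U
pushes one cell. Mnemonics are written with "_" in null positions, and
the function name is hypothetical.

```python
STACK_ARGUMENTS = {"T", "U"}      # implicit cells consumed when used as arguments
STACK_RESULTS = {"S", "T", "U"}   # implicit cells that receive a result


def stack_effect(mnemonic: str) -> int:
    """Net change in evaluation stack depth for a three letter format mnemonic."""
    if len(mnemonic) != 3 or any(ch not in "STUABC_" for ch in mnemonic):
        raise ValueError(f"not a format mnemonic: {mnemonic!r}")
    left, right, result = mnemonic
    popped = len({left, right} & STACK_ARGUMENTS)   # TT names one cell, not two
    pushed = 1 if result in STACK_RESULTS else 0
    return pushed - popped


if __name__ == "__main__":
    for fmt in ("UTU", "UTA", "TAT", "TAB", "ABS", "ABC"):
        print(fmt, stack_effect(fmt))
```

Treating a repeated T as a single cell is what makes formats such as
TTA come out at -1 rather than -2.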

While we have reduced the spatial requirements of multi-format
DEL structures to a practical order of magnitude, implementing all 30
formats listed in the table may still be prohibitive for some hosts.
The following theorems identify some interesting subsets of this
format universe.


Theorem 1: The canonic interpretive form requirements can be
satisfied using only eleven formats, up to the level of dyadic
operators, if "reverse" forms for all non-commutative operators are
included in the set of action functions.

Proof: Consider the following DEL restrictions and interpreter coding
conventions.

1.  Semantic routines for monadic operators must increment the
pointer to the top of the DEL evaluation stack before performing
their normal processing.

2.  "Reverse" forms for all non-commutative (dyadic) operators
must be included in the repertoire of DEL action functions.

Given these restrictions, we may eliminate all format codes
whose mnemonic contains the "_" by using the binary format
containing an "S", "T", or "U" in the same position, but which is
otherwise identical (interpreter convention). Formats differing only
by a reversal of the left and right argument binding (e.g., ABA
and ABB) are redundant under the DEL restriction; only one element
of each such pair is needed. Finally, no format whose code begins
with "TT" can be generated by a naive compiler, since this would
require recognition of the use of an intermediate value as a
repeated argument.

The set {UTU, UTA, TAT, TAA, TAB, AAS, ABS, AAA, AAB, ABA,
ABC} satisfies the theorem by inspection.


Theorem 1 demonstrates that the individual advantages of both
stack and register oriented architectures can be merged at a gross
cost of only four bits per instruction (eleven formats fit in a four
bit field), which compares favorably with typical Polish DELs (in
which each instruction contains two form bits to distinguish between
"push", "pop", "operate", and "literal"). For example, a single TAB
format is equivalent to the Polish sequence "push A, operate, pop B";
the first requires one instruction and four format bits, the second
requires three instructions and six format bits.

Determining the relative advantage of a format rich DEL over a
mono format, register oriented DEL with a variety of addressing modes
is more complicated. Auto increment and decrement capability can be
used to simulate a stack architecture, while indexing and indirection
can be used to simulate memory to memory oriented architectures.
Addressing mode flexibility does not extend to exploiting multiply
used operands, however, and is manifestly not as compact or efficient
as an implicit stack architecture (it is difficult to perform net
adjustments to the stack pointer, for example). Further, as will be
seen in the next section, there are more direct operand reference
encodings that can be used on most dynamic hosts.

Theorem 2: Only four formats are required if the DEL evaluation stack
is eliminated.

Proof: The set {AAA, AAB, ABA, ABC} is sufficient, by inspection.

Compilation is somewhat more difficult in this case, however,
since "dummy" variables must be synthesized in order to evaluate
compound expressions. Although fewer bits would be needed to indicate
the format code, it is likely that the space and time required during
execution would increase because of these extra explicit operand
syllables.

Theorem 3: Only six formats are needed to satisfy all but the "unique
variable" requirement of the canonic interpretive form.

Proof: The set {UTU, UTA, TAT, TAB, ABS, ABC} is sufficient, again by
inspection.

It is difficult to determine whether or not execution phase time
and space would increase or decrease if this reduced format set is
used, however, since the question is sensitive to user behavior. The
smaller format sets are interesting because of their coding
compatibility with hosts strongly biased toward 8 bit storage quanta.
If only two or three bits are needed to define the format of an
instruction, then it is possible to combine both the format and
operator code in a single byte.
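
A minimal sketch of such a single byte encoding follows. The layout
(format code in the three high-order bits, operator code in the five
low-order bits) and the numbering of the formats are assumptions made
for illustration; the text fixes neither.

```python
FORMATS = ("UTU", "UTA", "TAT", "TAB", "ABS", "ABC")   # the six formats of Theorem 3
FORMAT_BITS = 3                    # enough for up to eight format codes
OPCODE_BITS = 8 - FORMAT_BITS      # five bits remain for the operator code
OPCODE_MASK = (1 << OPCODE_BITS) - 1


def pack(format_name: str, opcode: int) -> int:
    """Combine a format code and an operator code into one byte."""
    if not 0 <= opcode <= OPCODE_MASK:
        raise ValueError("operator code does not fit in the remaining bits")
    return (FORMATS.index(format_name) << OPCODE_BITS) | opcode


def unpack(byte: int):
    """Split a byte into (format mnemonic, operator code)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a byte")
    format_code = byte >> OPCODE_BITS
    if format_code >= len(FORMATS):
        raise ValueError("unassigned format code")
    return FORMATS[format_code], byte & OPCODE_MASK


if __name__ == "__main__":
    encoded = pack("TAB", 17)
    print(encoded, unpack(encoded))
```

With three format bits only 32 operator codes fit in the byte; the two
unassigned format codes in this sketch could serve as the "escape"
formats for a larger operator set.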

Any of the above format sets would be enhanced by the addition of
special formats to handle reverse forms of non-commutative operators
(e.g., ATT, ATA, ATB, and ABB), or of auxiliary formats to simplify
interface processing for unary operators (e.g., _TT, _TA, _AS, _AA,
and _AB). One or two "escape" formats might also be added to provide a
mechanism for implementing higher order formats (for operators with
greater than binary order), user defined operators, or other DEL
extensions. The critical point is that these format sets are "rich"
enough to guarantee that no non-functional, memory oriented overhead
instructions need be generated to evaluate arithmetic expressions;
i.e., their M-ratio is zero by construction.

3.3.2.  Selecting  Operators 


Suppose that the design of a DEL is complete except for the
selection of its operator set; and further that a finite set
F = {f_1, ..., f_N} of potential operators is "well known", in the
sense that there is a micro expansion x_i and a macro expansion X_i
for each potential operator f_i (i = 1, ..., N). Intuitively, x_i is
the body of a host routine that implements the semantics of f_i, while
X_i is constructed entirely from other operators in F, and so could be
generated in place of f_i should it not be selected as a DEL operator.
The problem is to find a subset G of F that minimizes the space and
time requirements of the resulting DEL.


Let w_i be the number of micro store words required by x_i, and W
be the total number of words of micro store that can be used to hold
semantic routines. The difficulty is that w_1 + w_2 + ... + w_N may be
greater than the number of available words of micro store, so that it
is not possible to simply set G equal to F. Let:

    d_i = the dynamic frequency of f_i;
    t_i = the average time needed to execute x_i;
    T_i = the average time needed to execute X_i;
    s_i = the static frequency of f_i;
    l_i = the length of the identifier for f_i;
    L_i = the length of X_i;

and for any subset Z of F, define:

    t(Z) = the sum of d_i*t_i over all i such that f_i is in Z;
    T(Z) = the sum of d_i*T_i over all i such that f_i is in Z;
    s(Z) = the sum of s_i*l_i over all i such that f_i is in Z;
    S(Z) = the sum of s_i*L_i over all i such that f_i is in Z;
    w(Z) = the sum of all w_i such that f_i is an element of Z; and
    E(Z) = -( A*(t(Z) + T(F-Z)) + B*(s(Z) + S(F-Z)) ),

where A and B weight the time and space terms respectively.


The intent is to quantify the notion of efficiency by a linear
function, E, which implies that the marginal utility of micro store is
constant. This is a reasonable approximation for small changes in the
DEL operator set, since only a small fraction of the total space
available would be affected. The objective is now to find a set G
that maximizes E, subject to the constraint w(G) <= W. To this end,
define the merit of selecting operator f_i (i.e., the incremental
advantage of placing semantic routine x_i in micro store) to be:

    m_i = A*(d_i*(T_i - t_i)) + B*(s_i*(L_i - l_i))

Further, let the merit m(Z) of any subset Z of F be the sum of the
individual merits m_i for all i such that f_i is an element of Z. It
can be assumed without loss of generality that the elements of F are
ordered such that i < j implies either m_i/w_i > m_j/w_j, or
m_i/w_i = m_j/w_j and w_i >= w_j. The claim is that this defines a
natural lifeboat ordering for F, as reflected by:


Theorem 4: If G is the subset {f_1, f_2, ..., f_n} of F such that
w(G) <= W < w(G) + w_(n+1), then

    w(H) <= W  ->  E(H) - E(G) <= (m_n/w_n)*(W - w(G))

for any subset H of F.


Proof: Let H be any subset of F satisfying the hypothesis. If GH
denotes the intersection of G and H, then
w(H-GH) <= W - w(G) + w(G-GH) by definition. Now, m_j/w_j <= m_n/w_n
for all j such that f_j is in H-GH, since j must be greater than n by
construction; this means that:

    1)  m(H-GH) <= (m_n/w_n)*w(H-GH) <= (m_n/w_n)*(W - w(G) + w(G-GH))

since w_j > 0 for all j. But (m_n/w_n)*w(G-GH) <= m(G-GH), again by
construction; this means

    2)  m(H-GH) - m(G-GH) <= (m_n/w_n)*(W - w(G))

Since m(Z) = E(Z) + A*T(F) + B*S(F) for any subset Z of F, we have
E(H) - E(G) = m(H) - m(G) = m(H-GH) - m(G-GH), and therefore

    3)  E(H) - E(G) <= (m_n/w_n)*(W - w(G))    qed.

The difference in efficiency between an optimal DEL and that
resulting from an application of Theorem 4 must be less than a
comparatively small factor (m_n/w_n) times the unused micro store
(W - w(G)). The product should be quite small in comparison to the
overall efficiency rating of the DEL, both because W - w(G) is small
in comparison to w(G), and because m_n/w_n may be no greater than
m_i/w_i for all i < n.
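
The selection procedure implied by Theorem 4 can be sketched as a
greedy pass over the candidates in order of decreasing merit per word
of micro store. The code below is an illustration under stated
assumptions, not the authors' implementation: the class and function
names are hypothetical, the weights A and B default to 1, ties are
broken by taking the larger routine first, and the sample figures are
invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    w: int      # micro store words needed by the micro expansion x_i
    d: float    # dynamic frequency of f_i
    t: float    # average time to execute x_i
    T: float    # average time to execute the macro expansion X_i
    s: float    # static frequency of f_i
    l: float    # length of the identifier for f_i
    L: float    # length of X_i


def merit(c: Candidate, A: float = 1.0, B: float = 1.0) -> float:
    """m_i = A*(d_i*(T_i - t_i)) + B*(s_i*(L_i - l_i))."""
    return A * c.d * (c.T - c.t) + B * c.s * (c.L - c.l)


def select_operators(F, W, A=1.0, B=1.0):
    """Take the longest prefix of the merit-density ordering that fits in W words.

    Returns (G, words used, bound), where bound is (m_n/w_n)*(W - w(G)),
    the limit on E(H) - E(G) for any other feasible subset H.
    """
    ordered = sorted(F, key=lambda c: (-merit(c, A, B) / c.w, -c.w))
    G, used = [], 0
    for c in ordered:
        if used + c.w > W:
            break                 # stop at the first candidate that does not fit
        G.append(c)
        used += c.w
    bound = merit(G[-1], A, B) / G[-1].w * (W - used) if G else None
    return G, used, bound


if __name__ == "__main__":
    F = [
        Candidate("add", w=10, d=100, t=1, T=5, s=50, l=1, L=3),
        Candidate("mul", w=20, d=40, t=2, T=12, s=20, l=1, L=5),
        Candidate("div", w=40, d=10, t=4, T=20, s=5, l=1, L=9),
    ]
    G, used, bound = select_operators(F, W=35)
    print([c.name for c in G], used, bound)
```

Stopping at the first candidate that does not fit, rather than
scanning on for smaller routines, matches the prefix form of G in the
theorem; a production selector might continue scanning to use the
leftover words.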

(Footnote to the proof of Theorem 4: w(X-XY) = w(X) - w(XY) for any
subsets X and Y of F.)

The practical simplification is that it is no longer necessary to
formulate and solve a general linear programming problem in order to
select an efficient operator set. The question of how F is determined,
however, remains open. In many cases it is probably sufficient to set
F equal to the set of all functions used in the semantic specification
of the given source language. If the highest performance is to be
achieved, however, additional operators are likely to be needed. The
following principles may be useful; let F_0 be a preliminary set of
operators derived by inspection of the source language semantics:


1)  Set F_0 equal to the set of primitive functions extracted by
inspection of the semantic specification for the given source
language.

2)  Form F_1, the closure of F_0 under n-ary composition (n = 1-3
should be sufficient in light of Knuth's statistics [17]).

3)  Form F_2 by including natural decompositions for complex functions
(e.g., extracting "normalize" and "unnormalized multiply" operators
from a standard "floating multiply").

4)  Form F_3 by including special operators for frequent bindings of
operators in F_2 to literal arguments (e.g., adding a unary "INC"
operator to replace "_+1"), and again taking closure.


In general, it is important to exploit implicit specification of
functions or arguments whenever possible, a typical example being the
automatic invocation of a "standard fix-up" after arithmetic overflow
or underflow. This is especially true of program control and data
conversion/selection operators. For example, if the source language is
strongly structured (i.e., all control structures are strictly one-in
one-out), then it may be possible to keep a stack of pertinent
variables, addresses, etc., within micro store to speed up the
execution of looping constructs and/or recursive procedure invocation.


As a case in point, consider a generalized ENDO operator that
controls termination of FORTRAN DO-loops. This operator requires four
operands: an iteration count variable (J); an increment value (I); a
maximum count (M); and a loop transfer label (L). The expansion for a
typical loop, "DO 10 J = N, M, I", might be:

        MOVE  <N> <J>
    L   {body of loop}
        ENDO  <J> <I> <M> <L>

In this implementation, the iteration count variable is explicitly
initialized prior to loop entry. The ENDO operator must bind the
identifiers <J>, <I>, and <M> to the appropriate values; increment J
by I; and compare the result to M, performing the appropriate
data-dependent branch for each iteration. There is no way to avoid the
initialization and data-dependent branch steps, but if there are no
explicit transfers in or out of the loop body, special initialization
and termination operators could be used:

        INITDO  <N> <J> <I> <M>
    L   {body of loop}
        ENDX

In this case, the INITDO operator would temporarily move the values of
J, I, and M into micro store, initializing J in the process. The
loop-back address would also be automatically initialized at this
point. The ENDX operator need not repeatedly fetch, decode, and bind
the identifiers for J, I, M, and L to their respective values. This
saves four field extractions and four variable accesses per iteration
(the value of J must be both loaded and stored).
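
The per-iteration saving can be illustrated by counting operations in
a toy model of the two expansions. Everything about the cost model is
an assumption made for this sketch: a positive increment, a body that
executes at least once, the cost of the initial MOVE ignored, and J
written back to storage once when the cached loop exits. The function
names are hypothetical.

```python
def run_generic_endo(n, m, i):
    """MOVE <N> <J>; body; ENDO <J> <I> <M> <L>, all operands rebound each pass."""
    memory = {"J": n, "I": i, "M": m}      # MOVE <N> <J> initializes J
    extractions = accesses = iterations = 0
    while True:
        iterations += 1                    # body of loop
        extractions += 4                   # ENDO decodes <J>, <I>, <M>, <L>
        j, step, limit = memory["J"], memory["I"], memory["M"]
        memory["J"] = j + step
        accesses += 4                      # load J, I, M; store J
        if memory["J"] > limit:
            break
    return iterations, extractions, accesses


def run_initdo_endx(n, m, i):
    """INITDO <N> <J> <I> <M>; body; ENDX, with loop state held in micro store."""
    extractions = 4                        # INITDO decodes its four operands once
    accesses = 3                           # load N, I, M once
    j, step, limit = n, i, m               # cached copies (micro store)
    iterations = 0
    while True:
        iterations += 1                    # body of loop
        j += step                          # ENDX touches cached values only
        if j > limit:
            break
    accesses += 1                          # assumed: J stored back on loop exit
    return iterations, extractions, accesses


if __name__ == "__main__":
    print(run_generic_endo(1, 10, 1))
    print(run_initdo_endx(1, 10, 1))
```

In this model the generic form costs four extractions and four
accesses on every pass, while the cached form pays a fixed cost that
does not grow with the iteration count.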

3.4.  Process Name Space: General Issues

A name used by a process is a surrogate for a value. The set of
all names that can be accessed by a process is the name space for that
process. Source level names are usually just alphanumeric strings
imbedded within a program text; DEL level names are operand
identifiers appearing within executable instructions (usually in 1-1
correspondence with source names); and host level names are simply
addresses of accessible elements of the host storage hierarchy.
Values are associated with names via a "contents map": at any point
during a computation, the contents of a name is its correct value. In
this discussion, we are concerned only with the properties of names
themselves, not with the form of identifiers for these names or the
problem of interpreting identifiers within an executable instruction;
the contents mapping is assumed to be established externally (e.g., by
a loader).

Some  issues  related  to  the  concept  of  a process  name  space  are: 
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1.  range  and  resolution  of  objects, 

2.  range  extension — I/O  handling  and  files, 

3.  homogeneity  of  the  space, 

4.  reference  coding. 

Range  and  Resolution: 

Range and resolution refer to the maximum number of objects that
can be specified in a process space and the minimum size of an object
in that name space respectively. Traditionally, instructions provide
resolution usually no smaller than an 8 bit byte, and frequently a 16
bit or larger word, and range defined as large as one can comfortably
accommodate within the bounds of a reasonable instruction size and
hence program size. Thus, ranges from 2**16 for minicomputers to 2**24
for System 360 include most common arrangements.

Range  Extension: 

The range of the name space directly accessible to a host is
bounded, so it is essential that an extension mechanism be provided to
allow a process to access large data bases (e.g., I/O and file
handling). If the directly accessible range were unlimited, then as
soon as objects were entered anywhere in the system, the place of
entry in the processor name space could be regarded as an element in
the process name space.

An associated problem is that of attaching records to an
established process name space. Usually this attachment must be done
by a physical movement of data from its present location to an area
within the bounds of the present process name space before it can be
operated on. The programmer must manage data movement from the I/O
space into the process name space through I/O commands. This binding
or attachment is the responsibility of the programmer and must be
performed at the correct sequential interval so as to insure the
integrity of the data and yet not exceed the range limitations of the
name space (overflow buffers, for example). Ability to communicate
between an unbounded I/O medium and a bounded processor name space
allows the programmer to simulate an open ended name space.

It is, however, an uncomfortable requirement placed on the
programmer, and frequently results in cumbersome and inefficient
operations. Of course, the larger the range, the more precise and
variable the resolution, the easier it is to manage objects in the
process name space; flexibility in this regard both permits and
promotes conciseness during program development.

OBSERVATION: From the above, the desirability of an unbounded name
space with flexible attachment possibilities is clear.

Homogeneity:

While name spaces may be partitioned in many different ways,
homogeneity refers to partitions distinguished by the action rule of a
process. Action rules or instructions generally cannot treat all
objects in the same way. Certain classes of objects are established
such as registers, accumulators, and memory objects. Action rules are
applied in a non-symmetric way: one of the arguments for an action
rule must be a register whereas the other may be a register or a
memory object. The premise of this partitioning is performance, i.e.,
the assumption that access to registers is faster than access to
memory. Thus, many familiar machines have their name space
partitioned into a register space and memory space: 360, PDP-11, etc.
As the partitioning of the name space increases, its homogeneity
decreases.

References:

Mapping identifiers into their image in the host name space (i.e.,
determining the actual location or address of a named object) involves
a subtle series of design issues. There is a broad spectrum of
potential tradeoffs between interpretation time and program
representation size. Traditional issues in identifier construction
include: short vs. long addresses; indexing; indirection; dynamic
tagging; etc.

The reference problem may be broken down into two parts,
referencing operands and referencing operators. Operand referencing
involves extracting or updating the value of an object, while operator
referencing involves the invocation of an action rule (i.e., process
state transformation).

3.4.1.  Name Space Synthesis

Providing a flexible and effective name space structure helps
minimize the space and time requirements of a DEL. Good designs are
characterized by both a simple correspondence between the source name
space and the DEL name space (to simplify compilation and preserve
transparency), and a simple correspondence between the DEL name space
and the host name space (to maintain efficiency during execution).

High level language name spaces generally involve effectively
unbounded ranges, one dimensional reference structures (viewing
subscripted arrays and other qualified references as "expressions"
rather than primitive symbols), and discrete granularity (i.e.,
reference structure does not induce a fixed relation between referands
in the memory space). The identifiers used as references at this level
are syntactically homogeneous, but semantically inhomogeneous; i.e.,
interpretation of the contents map for a referand depends on the
context in which its reference appears. In particular the referand
associated with a particular source name may be different for
different occurrences of that name.

This is because the name space of most source programs is
partitioned into distinct scopes of definition (or "scope" for short;
intuitively, a scope is simply a natural grouping of references within
which the association between references and referands is fixed,
unless altered explicitly by dynamic allocation or redefinition
statements).

On the other hand, most host level name spaces are structurally
inhomogeneous, being partitioned into register sets, storage modules,
etc. References to elements in these partitions are rarely
interchangeable within a host instruction. The association between
references and referands is usually fixed at this level, however, even
though it may be parameterized in terms of the current contents map
(e.g., as in indexed or indirect referencing). Such discrepancies
between the source and host name spaces account for much of the
difficulty in synthesizing an effective DEL name space.

DEL organizations may be classified according to the placement of
different portions of the information needed to bind a reference to a
referand (Chevance [3]). Data is characterized by three distinct
pieces of information: type, locator, and value. The type of a
referand defines the range of values it may assume; its locator
defines the address to be used when accessing its contents; and its
value is the bit pattern assigned by the current contents map, which
must be interpreted according to its data type.

The type and locator may be specified either in the operation
code of an instruction or in operand reference codes, either directly
or indirectly (e.g., through a display vector). Four such
combinations are:

1.  Type in operation code, locator in one dimensional reference
(conventional machine languages).

2.  Type and locator concatenated in two dimensional reference (this
form is typical of higher level DELs; e.g., Weber [29], Wilner
[31], and Wortman [32]).

3.  Type and locator concatenated in a "descriptor" identified
indirectly through a one dimensional reference (descriptor based
machines).

4.  Locator is reference indirected individually through a two
dimensional reference (theoretical, no known example).

The traditional approach is to partition the DEL name space along the
same lines as the host name space, mapping symbolic names into
distinct indexed (two or three dimensional) references; i.e., a type 1,
2, or 4 organization. The compiler must insure that the proper base
address is loaded into an appropriate index register when the
translated references are evaluated. This increases the M-ratio of
the resulting dynamic instruction stream by requiring significant
load/store activity to maintain correct base register values. For
example, the statement "I = J - I" might expand to:

L Rl,  @I 

L R2,  @J 

L R2,  0(R2) 

S R2,  0(R1) 

ST  R2,  0(R1) 

using  a 360/370  machine  language  DEL.  Only  the  subtract  instruction 
is  functional;  the  first  and  second  instructions  are  overhead  caused 
by  the  range  differential  between  source  and  DEL  name  space,  while  the 
third  and  fifth  instructions  are  memory  oriented  overhead  caused  by  a 
combination  of  the  Inhomogenelty  of  the  DEL  name  space  (storage  and 
register  references  no  Interchangeable)  and  combinatorial  restrictions 
of  the  360  architecture  (it  has  no  ABB  format).  This  approach 
emphasizes  the  Importance  of  register  allocation,  and  leads  to  ela- 
borate multi  pass  algorithms  for  minimizing  load/store  activity  (Sethi 
[25]  and  Stockausen  [27]). 




Incorporating locator information in the reference itself also
leads to complications in handling the thorny problems associated with
changes in scope (e.g., storage management, passing parameters, and
accessing externally defined referands); none of the above forms
solves this problem by construction. Perhaps the best known model for
describing the effects of scope is the Contour Model developed in
Johnston [16]. This model is rich enough to describe the address map
transformations required by the allocation, release, and retention
rules of most source languages, and captures all practical methods of
binding actual arguments to formal parameters as well. Its guarantee
of completeness suggests the Contour Model as a good design base for
DEL name spaces.

A process is defined to be a time invariant algorithm together
with a time varying record of execution. Discrete points in an
execution record are identified by an encoded pair, formal parameters
in a different manner than local variables, however, either by
including explicit operators in the DEL instruction stream (McClure),
or always testing for indirection (Wilner); Bashkow avoids the problem
by restricting his source language to a subset that does not include
subroutine blocks or arrays.

3.4.2.  Environment and Contours


The notion of environment is fundamental not only to DELs but
also to traditional machine languages, as evidenced by the widespread
adoption of cache and virtual memory concepts. What is proposed here is
akin in some respects to the cache concept and yet quite distinct from
it. We recognize locality as an important property of a program name
space and handle it explicitly under interpreter control. Thus,
locality is transparent to the DEL name space but recognized and
managed by the interpreter. Properties of the environment are:

1.  The DEL name space is homogeneous and uniform with an a priori
unbounded range and variable resolution.

2.  Operations involving, for example, the composition of addresses
which use registers should not be present in the DEL code but
should be part of the interpreter code only. Thus, the register
name space and the interpreter name space are largely not part of
the DEL name space, and it is the function of the interpreter to
optimize register allocation.

3.  The environment locality will be defined by the higher level
language for which this representation is created. In FORTRAN,
for example, it would correspond to function or subroutine scope.

4.  Unique to every environment is a scope which includes:

    i.   a label contour,
    ii.  an operand contour,
    iii. an operation table.


Following the Johnston model, we define a contour to be a vector
(or table) of object descriptors. When an environment is invoked, a
contour of label and variable addresses must be established (if not
already present) in the interpretive storage. For a simple static
language like FORTRAN this creation can be done at load time. For
languages that allow recursion, etc., the creation of the contour
would be done before entering a new environment. An entry in the
contour consists of the (main memory) address of the variable to be
used; this is the full and complete DEL name space address. Type
information and other descriptive details may also be included as part
of the entry.


The environment must provide a pointer into the current contour,
and must define the width of identifiers for labels and variables.
Typically, the contour pointer and identifier width would be
maintained in registers of the host machine. We denote identifier
width by W and the pointer to the base of the current contour by EP;
Figure 9 illustrates the process of referencing a DEL entity using
this terminology. Both labels and variables may be indexed off the
same environmental pointer. Subfields within DEL instructions, then,
are actually containers for immediate values that define indices in
the current contour; contour entries at the indexed location define
the mapped address of the desired variable or label in the host name
space. In other words, the operand identifiers within DEL instructions
are simply contour indices that select a particular description for
the image of a given source level object in the host name space.

The Contour Model differs from other high level DEL architectures
in that the function of references is separated from that of
descriptors. References are one dimensional indices into a current
declaration array, which we call the current contour. The current
contour is always maintained within the host micro store, and a new
contour is created for each distinct incarnation of a source scope.
This is an extreme case of a type 2 organization, in which only W bits
are used to represent a reference, where W is the smallest integer
such that there are fewer than 2**W distinct referands in the current
access environment.

[Figure 9: Referencing a DEL Variable. Recoverable labels: "DEL
instruction", "environment", "F and W identify A".]

Each contour is uniquely identified by an environment pointer
that, at least logically, denotes its zeroth element. The environment
pointer for the current contour is part of the DEL program state
vector, and must be saved/restored when entering/leaving a scope of
definition. The address map is computed by adding the reference code
to the current environment pointer, and then accessing the appropriate
referand descriptor (Figure 10):

    descriptor ( reference N )  =  micro store ( EP + N )
    value ( reference N )       =  main store ( descriptor ( reference N ) )

Figure 10:  Normal DEL Addressing Structure
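The two-step address map of Figure 10 can be illustrated with a minimal Python sketch. The names `micro_store`, `main_store`, `descriptor`, and `fetch` are hypothetical, the store sizes are arbitrary, and a descriptor is reduced here to a bare main store address, although the text allows it to carry type information as well.

```python
# Hypothetical model: both stores are plain Python lists of integers.
micro_store = [0] * 16   # holds contours (vectors of descriptors)
main_store = [0] * 64    # holds the values of referands

def descriptor(ep, n):
    """Step 1: index the current contour with reference code n."""
    return micro_store[ep + n]

def fetch(ep, n):
    """Step 2: use the descriptor as a main store address."""
    return main_store[descriptor(ep, n)]

# A contour based at ep = 4 with three entries (assumed layout).
ep = 4
micro_store[ep:ep + 3] = [40, 41, 50]
main_store[40], main_store[41], main_store[50] = 7, 9, 11
```

Under these assumptions a reference costs one micro store access and one main store access, which is the two-cycle effective address calculation the text has in mind.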

This analysis can be extended by noting that the logical type of a
referand (integer, floating point, logical, or character) can be
separated from its physical type (single, double, or varying
precision). We refer to the physical type as "shape". Elements of
contours are descriptors, each of which is itself a vector that defines
the shape, type, and locator of a particular DEL entity or, more
precisely, the algorithm used to access that entity. Distinguishing
shape within the descriptor allows us to use semantic routines
designed for the general case, rather than having one per type:shape
combination.

It is important that descriptor processing be kept as simple as
possible. For most languages, this means that the value of the
variable will be located in the main store cell whose address is
defined by the appropriate descriptor; e.g., the value of the n-th DEL
variable is located in the memory cell(s) whose initial address is
given by the contents of the n-th word of the contour in micro store.
If this is done, then the effective address of a referand can be
calculated in two basic host cycles using our method (micro store is
assumed to have an access time comparable with the time needed to
perform a primitive arithmetic operation). Essentially, dynamic
contours are a simple mechanism for exploiting the writability of
modern micro stores; in effect we have created a distinct "base
register" for each distinct DEL entity rather than for contiguous
blocks of entities.

If the source language has the property that two distinct source
names can never denote the same referand, then the indirection step
may be avoided by maintaining values of (scalar) DEL variables
directly in the contour. This is not usually the case, however, due
either to "overlay" capability (e.g., the EQUIVALENCE feature in
FORTRAN, or pointer references in PASCAL), or to the possibility of
binding the same actual argument to two distinct formal parameters
using "call by reference" or "call by name".


Given a fully static source language (like BASIC, FORTRAN, or
PASCAL) a unique contour for each distinct scope of definition may be
preallocated during compilation. In this case, only the descriptors
for formal parameters need be modified during execution. For most
source languages, however, a new contour will have to be created each
time a new scope is entered, particularly if the source language
supports recursive procedure invocation. In this case, a highly encoded
header could be attached to the algorithmic body of DEL surrogates to
serve as a phantom, or "skeletal", contour. Descriptor components that
can be fixed at compilation would appear as literals in this header;
components that cannot be determined until block entry would be
parametrically encoded to simplify run time computation.

Since the header entries need be evaluated only once per contour
creation, they can be relatively complex and difficult to evaluate.
However, this factors out the common calculations needed to compute
effective addresses; there will be a substantial time savings whenever
variables are accessed repeatedly within a contour, and the
possibility of a time loss when variables are not accessed at all. The
penalty can be avoided by marking descriptors in the current contour
as "unbound" until they are actually referenced. Each time a DEL
reference is processed, its descriptor must be checked for validity;
this usually means that some form of hardware support is required for
this stratagem to work efficiently. Lacking a tagged architecture, it
is likely that the time needed to decide whether a contour element is
a value or a descriptor will swamp the time saved by sometimes
avoiding a main store access. The "tag" in this case is not a type
field concatenated with values in main store, but rather a "presence
flag" concatenated with the descriptor/value in micro store. This
keeps the number of tag bits low, and simplifies host implementation.
Such an explicit caching technique should be evaluated carefully in
light of the specific capabilities of the given host.
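The "unbound until referenced" stratagem can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `LazyContour`, its `header` of callables (standing in for the encoded skeletal header), and the base address 100 are all hypothetical, and the per-reference validity check is done in software here, whereas the text expects it to need hardware support to be efficient.

```python
class LazyContour:
    """Contour whose descriptors stay unbound until first reference."""

    def __init__(self, header):
        # header: one zero-argument callable per entry; each computes a
        # main store address (stand-in for the encoded skeletal header).
        self.header = header
        self.present = [False] * len(header)    # presence flags
        self.descriptors = [None] * len(header)
        self.evaluations = 0                    # bookkeeping for the demo

    def descriptor(self, n):
        if not self.present[n]:                 # validity check per reference
            self.descriptors[n] = self.header[n]()
            self.present[n] = True
            self.evaluations += 1
        return self.descriptors[n]

base = 100  # hypothetical block base, known only at scope entry
contour = LazyContour([lambda: base + 0, lambda: base + 4, lambda: base + 8])
addresses = [contour.descriptor(1), contour.descriptor(1), contour.descriptor(2)]
```

Entry 0 is never referenced and so is never evaluated, while entry 1 is evaluated once and then reused; the cost is the presence test on every reference.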

The contour technique is easily adapted to most existing
parameter passing conventions. Parameters may be passed "by reference"
simply by copying the appropriate descriptors from the caller's contour
into the callee's contour. Parameters are passed "by value" by
initializing a variable created either in the caller's environment
(call by copy value), or in the callee's environment (call by value
copy), with the value of the argument referand in the caller's contour.
"By name" parameter passing involves moving an IP:EP pair into the
appropriate descriptor in the callee contour; the IP:EP, where IP is an
instruction pointer into the time invariant algorithm, and EP is an
environment pointer identifying a particular access environment. No
transformation identified by the IP can depend upon or alter the
contents of a memory cell unless that cell is in the address mapping
image of the current access environment.
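The three parameter passing conventions described above can be sketched in Python. Everything named here is an assumption made for illustration: contours are lists of main store addresses, `main_store` is a dictionary, `allocate` hands out fresh cells, the by-value case uses the variant that creates the variable in the callee's environment, and a by-name descriptor is modeled as a tagged tuple holding the IP:EP pair without being evaluated.

```python
main_store = {}            # address -> value (hypothetical main store)
next_free = [1000]         # next unused main store address

def allocate(value):
    """Create a fresh main store cell and return its address."""
    address = next_free[0]
    next_free[0] += 1
    main_store[address] = value
    return address

def pass_by_reference(caller, i, callee, j):
    callee[j] = caller[i]                          # copy the descriptor

def pass_by_value(caller, i, callee, j):
    callee[j] = allocate(main_store[caller[i]])    # copy the value

def pass_by_name(ip, ep, callee, j):
    callee[j] = ("thunk", ip, ep)                  # store an IP:EP pair

caller = [allocate(5), allocate(8)]   # caller's contour: two descriptors
callee = [None, None, None]           # callee's contour: three formals
pass_by_reference(caller, 0, callee, 0)
pass_by_value(caller, 1, callee, 1)
pass_by_name(ip=42, ep=0, callee=callee, j=2)

main_store[callee[0]] = 6    # callee assigns to its by-reference formal
main_store[callee[1]] = 9    # callee assigns to its by-value formal
```

The assignment through the by-reference formal is visible to the caller because both contours hold the same descriptor; the by-value assignment is not, because the callee's descriptor names a separate cell.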
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Every  access  environment  contains  a declaration  array 
that is, conceptually, a linear vector of address map definitions.
Each entry in the declaration array is uniquely associated with a
particular source name, and completely specifies all of the
information needed to access the referand of that name. In practice,
the Contour Model is usually realized in terms of a two dimensional
reference structure of the form level:offset, where "level" is
associated with a lexical scope of definition and "offset" is
associated with the physical location of a referand (level codes are
also called segment numbers, block numbers, page numbers, etc.; and
offset codes are sometimes referred to as occurrence numbers or
placement indices).

Upon entering a scope, a block of storage is allocated in the
memory space sufficient to contain all of the local variables known to
be referenced within the block. During compilation, various positions
relative to the beginning of this block are preassigned to specific
source referands, thus determining the offset code for their
associated references. Storage is usually managed by partitioning it
into two distinct classes: a LIFO stack that contains all of the local
referands allocated automatically at scope entry; and a heap that
contains all referands that exist independently of the normal
procedure entry/exit mechanism.
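The conventional level:offset realization with LIFO allocation can be sketched as follows. The class `ScopeStack` and its `display` list are hypothetical, and the sketch assumes that scopes are entered in strict lexical nesting order so that the level number can index the display directly; recursion and the heap are not modeled.

```python
class ScopeStack:
    """LIFO storage for local referands, addressed as level:offset."""

    def __init__(self):
        self.store = []     # hypothetical linear main store
        self.display = []   # display[level] = base of that level's block

    def enter(self, n_locals):
        """Allocate a block big enough for all locals of the new scope."""
        self.display.append(len(self.store))
        self.store.extend([0] * n_locals)

    def leave(self):
        """Release the most recently allocated block."""
        base = self.display.pop()
        del self.store[base:]

    def address(self, level, offset):
        """Resolve a two dimensional reference to a store address."""
        return self.display[level] + offset

scopes = ScopeStack()
scopes.enter(3)                          # outer scope: offsets 0..2
scopes.enter(2)                          # inner scope: offsets 0..1
inner = scopes.address(1, 1)             # level 1, offset 1
scopes.store[scopes.address(0, 2)] = 9   # write an outer-scope local
scopes.leave()                           # inner block is released
```

Each reference here decodes two fields (level, then offset), which is the multiple decode step that a one dimensional contour index avoids.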

The obvious space saving aspect of linear contours is that only W
bits are needed to identify an arbitrary DEL variable. Only three or
four bits are needed to encode W within the DEL program status vector,
so it could easily be updated each time the environment pointer
is changed, allowing the inherent locality of well structured source
programs to be exploited in a direct manner. This method is at least
as fast as the display vector approach, and may well be more efficient
because it involves only a one dimensional index and so does not incur
multiple decode overhead.
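The identifier width can be computed with a short Python sketch. The function `identifier_width` and its `strict` flag are hypothetical. By default it assumes that all 2**W codes are usable, which matches the later example in which four variables are identified with two bits; setting `strict=True` applies the stronger condition of fewer than 2**W referands.

```python
def identifier_width(n_referands, strict=False):
    """Smallest W with n <= 2**W (or n < 2**W when strict is True)."""
    if n_referands < 1:
        raise ValueError("a contour needs at least one referand")
    limit = n_referands + 1 if strict else n_referands
    # (limit - 1).bit_length() is the smallest W with 2**W >= limit.
    return max(1, (limit - 1).bit_length())
```

For example, a contour of sixteen referands needs four bit identifiers under the default assumption, so W itself fits comfortably in the three or four bits mentioned above.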

3.4.3.  Operation Contours

Each verb or operation in the higher level language identifies a
corresponding interpretive operator in the DEL program representation
(control actions may be treated either as an operation or as a format
type). The routines for interpreting all familiar operations are
expected to lie in interpretive storage. Certain unusual operations,
such as transcendental functions, may not always be contained in the
interpretive storage. A pointer to an operator translation table must
be part of the environment; the actual operations used are indicated by
a small index container off this pointer (Figure 11). The table is also
present in the interpretive storage. For simple languages, this
latter step is probably unnecessary since the total number of
operations may be easily contained in, for example, a six bit field and
the saving in DEL program representation may not justify the added
interpretive step.
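The operator translation table can be illustrated with a minimal Python sketch. The dictionary `ROUTINES` stands in for the semantic routines held in interpretive storage, and `make_operation_table` and `execute` are hypothetical names; the point shown is only that an environment using few operators can select them with a small index.

```python
import operator

# Hypothetical interpretive storage: every semantic routine the host knows.
ROUTINES = {"add": operator.add, "sub": operator.sub,
            "mul": operator.mul, "div": operator.truediv}

def make_operation_table(used):
    """Per-environment table listing only the operators actually used."""
    return [ROUTINES[name] for name in used]

def execute(op_table, op_index, a, b):
    """Select the routine with a small index, then apply it."""
    return op_table[op_index](a, b)

table = make_operation_table(["add", "sub", "mul"])   # indices fit in 2 bits
```

The extra table lookup is the "added interpretive step" weighed above against the shorter operator field.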

[Figure 11: Referencing a DEL Operator]

In general, contours could be established for DEL blocks
corresponding to: a single source operator; an individual source
statement; a linear segment of source statements; a source clause; a
source procedure; or the entire source program. Further research is
required to determine which level is space-time optimal. It should be
noted, however, that loop and procedure blocks are reasonable choices
for contour extents: a significant amount of non-trivial sequential
other mandatory computations.

3.4.4.  AN EXAMPLE AND SOME RESULTS [8]

Again consider the previous example:

    1   I := I+1
    2   J := (J-1)*I
    3   K := (J-1)*(K-I)

This might be implemented as:

    Statement    Implementation         Semantics

                  4    2    2    2
        1        ABA   I    1    +      I := I+1
        2        ABT   J    1    -      T := J-1
                 TAB   I    J    *      J := T*I
        3        ABT   J    1    -      T := J-1
                 ABT   K    I    -      T := K-I
                 TUA   K         *      K := T*U

where T and U are the top and next-to-top (under top) stack elements,
respectively. The size, in bits, of each identifier field in the
first instruction appears directly above the corresponding mnemonic.
Note that the stack is "pushed" automatically by the instruction,
and the 6th instruction "pops" the stack for further use.

Our CIF rules apply directly to container size: two bits are
allowed to identify the four variables and two bits are used for the
four operations. The canonic number of instructions is achieved, as
are the variable and operation container sizes; however, 4 additional
bits per instruction are needed in this implementation to identify the
correct format (out of the eleven instruction formats discussed in
Theorem 1, plus four additional control operators).

There is a difference between the transformational completeness
required by the canonic rules and the achieved transformational
completeness. The two agree only for statements containing at most one
functional operator, so that the implementation contains an additional
J-identifier in instruction 3 and an additional K-identifier in
instruction 6. These do not, however, necessitate additional memory
references since separate domain and range references are also
required in the CIF if a single variable is used both as a source and
sink within a given statement. The comparison with the CIF measures
is shown below.


ACHIEVED vs. THEORETICAL EFFICIENCY

                            Achieved        CIF
    Number of
      Instructions           6               6
      Operand Identifiers   11               9
      Operator Identifiers   6               6
      Memory References      2 (i.u.)        1 (i.u.)
                            12 (data)       12 (data)
      Totals                14 total        13 total

    Size of
      Each Identifier        2 bits          2 bits
      Total Program         58 bits         30 bits

We assume that 32 bits are fetched per memory reference during
the instruction fetch portion of the interpretation process. While
the program size has grown with respect to the CIF measure, it is still
substantially less than the System 370 representation; other measures
are comparable to CIF.
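The six-instruction implementation of this section can be checked with a small Python interpreter sketch. The format semantics used here (ABA: A op B into A; ABT: A op B pushed on the stack; TAB: T op A into B; TUA: T op U into A) are inferred from the semantics column of the example rather than taken from Theorem 1, and the names `PROGRAM`, `run`, and `program_bits`, as well as the initial variable values, are hypothetical.

```python
# Each instruction: (format, operand identifiers, operator).
PROGRAM = [
    ("ABA", ("I", "1"), "+"),   # I := I+1
    ("ABT", ("J", "1"), "-"),   # T := J-1
    ("TAB", ("I", "J"), "*"),   # J := T*I
    ("ABT", ("J", "1"), "-"),   # T := J-1
    ("ABT", ("K", "I"), "-"),   # T := K-I
    ("TUA", ("K",),     "*"),   # K := T*U
]
OPS = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
       "*": lambda x, y: x * y}

def run(program, variables):
    env, stack = dict(variables), []
    for fmt, ids, op in program:
        f = OPS[op]
        if fmt == "ABA":      # A op B -> A
            env[ids[0]] = f(env[ids[0]], env[ids[1]])
        elif fmt == "ABT":    # A op B -> pushed on the stack
            stack.append(f(env[ids[0]], env[ids[1]]))
        elif fmt == "TAB":    # T op A -> B, popping T
            env[ids[1]] = f(stack.pop(), env[ids[0]])
        elif fmt == "TUA":    # T op U -> A, popping both
            t, u = stack.pop(), stack.pop()
            env[ids[0]] = f(t, u)
    return env, stack

def program_bits(program, format_bits=4, id_bits=2, op_bits=2):
    """Total size: format field + operand identifiers + operator field."""
    return sum(format_bits + id_bits * len(ids) + op_bits
               for _, ids, _ in program)

result, stack = run(PROGRAM, {"I": 2, "J": 5, "K": 10, "1": 1})
```

With the container sizes given in the text (4 bit format, 2 bit identifiers, 2 bit operators), `program_bits(PROGRAM)` reproduces the 58 bit total of the table: five instructions of 10 bits and one of 8 bits.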


The example discussed in the preceding section may be criticized
as being non-typical in its DEL comparisons:

i.   The containers are quite small, thus reducing the size
measures for the DEL code.

ii.  Program control is not included.

iii. The program reduction in space may come at the
expense of host machine interpretation time.

With respect to the first criticism, note that the size of a
program representation grows as a log function of the number of
variables and operations used in an environment. If sixteen variables
were used, for example, program size would increase by 50% (to 90 bits).
It is even more interesting, however, to observe what happens to the
same three statements when they are interspersed in a larger context
with perhaps 16 variables and 20 statements and compiled into System
370 code. The size of the object code produced by the compiler for
either optimized or unoptimized versions increases by almost exactly
the same 50%, primarily because the compiler is unable to optimize
variable and register usage.

The absence of program control also has no significant
statistical effect. A typical FORTRAN DO or IF is compiled into between
3 and 9 System 370 instructions (assuming a simple IF predicate)
depending upon the size of the context in which the statement occurs.
Thus, the inclusion of program control will not significantly alter the
statistics and may even make the DEL argument more favorable.

The third criticism is more difficult to respond to. We submit
that host interpretation time should not be noticeably increased over
a traditional machine instruction if the same premises are made, since:

i.   16 DEL formats must be contrasted against perhaps 6 or 8
System 370 formats (using the same definition of format); this is not a
significant implementation difference.

ii.  Some features are required by a 370 instruction even if not
required by the instruction, e.g., indexing. Name completion
through base registers is a similar situation since the base
values remain the same over several instructions.

iii. Approximately the same number of state transitions are
required for either a DEL instruction or a traditional machine
instruction if each is referred to its own "well mapped" host
interpreter. In fact, for an unbiased host designed for
interpretation the interpretation time is approximately the
same for either a DEL instruction or a System 370 instruction.

The language DELtran, upon which the aforementioned example was
based, has been developed as a FORTRAN DEL. The performance and vital
statistics of DELtran on the host EMMY [24] are interesting, especially
when compared to the 370 performance on the same system. The table
below is constructed using a version of the well-known Whetstone
benchmark, which is widely accepted and used for FORTRAN machine
evaluation. The EMMY host system referred to in the table is a very
small system: the processor consists of one board with 305 circuit
modules and 4096 32 bit words of interpretive storage. It is clear that
the DELtran performance is significantly superior to the 370 in every
measure.

DELtran vs. System 370 Comparison for the Whetstone Benchmark

    Whetstone Source:   80 statements (static)
                        15,233 statements (dynamic)
                        8,624 bits (excluding comments)

                              System 370                        ratio
                              FORTRAN-IV opt 2   DELtran        370/DELtran

    Program Size (static)     12,944 bits        2,428 bits     5.3:1
    Instructions Executed     101,016 i.u.       21,843 i.u.    4.6:1
    Instructions/Statement    6.6                1.4            4.6:1
    Memory References         220,561 ref.       46,939 ref.    4.7:1
    EMMY Execution Time       0.70 sec.          0.14 sec.      5:1
      (370 emulation approximates 360 Model 50)
    Interpreter Size          2,100 words        800 words      2.6:1
      (excludes I/O)

Before concluding, a further comparison is in order. Wilner [31]
reports the S-language for FORTRAN on the B-1700 as offering a 2:1
space improvement over System 360 code. The FORTRAN S-language
instruction consists of a 3 or 9 bit OP code container followed by
operand containers of (usually) 24 bits, split as descriptor, segment,
and displacement (not unlike our interpretive storage entry). The
format set used in this work is of limited size, and does not possess
transformational completeness. However, even this early effort offers
noticeable improvement of static program representation.